Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling
Abstract
Background
For assembling large whole-genome sequence datasets to be used routinely in research and breeding, the sequencing strategy should be adapted to the methods that will later be used for variant discovery and imputation. In this study we used simulation to explore the impact that the sequencing strategy and level of sequencing investment have on the overall accuracy of imputation using hybrid peeling, a pedigree-based imputation method well-suited for large livestock populations.
Methods
We simulated marker array and whole-genome sequence data for fifteen populations with simulated or real pedigrees that had different structures. In these populations we evaluated the effect on imputation accuracy of seven methods for selecting which individuals to sequence, the generation of the pedigree to which the sequenced individuals belonged, the use of variable or uniform coverage, and the trade-off between the number of sequenced individuals and their sequencing coverage. For each population we considered four levels of investment in sequencing that were proportional to the size of the population.
Results
Imputation accuracy largely depended on pedigree depth. The distribution of the sequenced individuals across the generations of the pedigree underlay the performance of the different methods used to select individuals to sequence. It was also critical to balance imputation accuracy between early and late generations. Imputation accuracy was highest with a uniform coverage of around 2x across the sequenced individuals rather than with variable coverage. An investment equivalent to the cost of sequencing 2% of the population at 2x provided high imputation accuracy. The gain in imputation accuracy from additional investment diminished with larger populations and larger levels of investment. However, to achieve the same imputation accuracy, smaller populations required a proportionally greater investment than larger ones.
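The trade-off between the number of sequenced individuals and their coverage under a fixed investment can be illustrated with a short sketch. It assumes, purely for illustration, that cost scales linearly with coverage (library preparation and other fixed per-sample costs are ignored); the function name, the default coverages, and the example population size of 10,000 are hypothetical and are not taken from the study.

```python
def individuals_at_coverage(population_size, fraction=0.02,
                            reference_coverage=2.0,
                            coverages=(1.0, 2.0, 4.0)):
    """Number of individuals that a fixed budget can sequence at each coverage.

    The budget is expressed as the cost of sequencing `fraction` of the
    population at `reference_coverage`. ASSUMPTION: cost is proportional to
    coverage, with no fixed per-sample cost.
    """
    # Total budget in "individual x coverage" units.
    budget = population_size * fraction * reference_coverage
    # At a uniform coverage c, the budget covers budget / c individuals.
    return {c: round(budget / c) for c in coverages}


# Hypothetical population of 10,000 individuals.
options = individuals_at_coverage(10_000)
for coverage, n in options.items():
    print(f"{coverage:g}x: {n} individuals")
```

Under this simplified cost model, halving the coverage doubles the number of individuals that can be sequenced. With real per-sample costs the relationship would be less favourable to very low coverage.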
Conclusions
Suitable sequencing strategies for subsequent imputation with hybrid peeling involve sequencing around 2% of the population at a uniform coverage of around 2x, with the sequenced individuals preferably drawn from the third generation of the pedigree onwards. Such sequencing strategies are beneficial for generating whole-genome sequence data in populations with deep pedigrees of closely related individuals.

