How imputation can mitigate SNP ascertainment Bias
Abstract
Background
Population genetic studies based on genotyped single nucleotide polymorphisms (SNPs) are influenced by the non-random selection of the SNPs included on the genotyping arrays used. The resulting bias in estimates of allele frequency spectra and of population genetic parameters such as heterozygosity and genetic distances, relative to whole genome sequencing (WGS) data, is known as SNP ascertainment bias. Full correction for this bias requires detailed knowledge of the array design process, which is often not available in practice. This study suggests an alternative approach that mitigates ascertainment bias in a large set of genotyped individuals by imputing from a small set of sequenced individuals, without the need for prior knowledge of the array design.
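To make the direction of the bias concrete, the following is a minimal sketch, not the study's method. It assumes biallelic sites, computes expected heterozygosity as the mean of 2p(1-p) without a small-sample correction, and mimics array design with a hypothetical minor allele frequency filter (`ascertain`, `min_maf`). The frequency spectrum is simulated toy data.

```python
import random

def expected_heterozygosity(freqs):
    """Mean of 2p(1-p) over biallelic sites with alternate-allele frequency p."""
    return sum(2 * p * (1 - p) for p in freqs) / len(freqs)

def ascertain(freqs, min_maf=0.05):
    """Hypothetical array filter: keep sites whose minor allele frequency
    is at least min_maf (an assumption, not the real array design rule)."""
    return [p for p in freqs if min(p, 1 - p) >= min_maf]

random.seed(1)
# Toy spectrum skewed towards rare alleles, as is typical of sequence data.
wgs_freqs = [random.betavariate(0.2, 2.0) for _ in range(10_000)]
wgs_freqs = [p for p in wgs_freqs if 0 < p < 1]  # drop monomorphic draws

array_freqs = ascertain(wgs_freqs)

he_wgs = expected_heterozygosity(wgs_freqs)
he_array = expected_heterozygosity(array_freqs)
print(f"sites: WGS={len(wgs_freqs)}, array={len(array_freqs)}")
print(f"expected heterozygosity: WGS={he_wgs:.3f}, array={he_array:.3f}")
```

Because the filter removes exactly the sites that contribute the least heterozygosity, the array-like subset always yields the higher estimate in this toy setting.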
Results
The strategy was first tested by simulating additional ascertainment bias with a set of 1566 chickens from 74 populations that were genotyped for the positions of the Affymetrix Axiom™ 580 k Genome-Wide Chicken Array. Imputation accuracy was consistently higher for populations used for SNP discovery during the simulated array design process. Reference sets containing at least one individual per population in the study set led to a strong correction of ascertainment bias for estimates of expected and observed heterozygosity, Wright’s Fixation Index and Nei’s Standard Genetic Distance. In contrast, unbalanced reference sets (in which some populations were overrepresented compared to the study set) introduced a new bias towards the reference populations. Finally, the array genotypes were imputed to WGS using reference sets ranging from 74 individuals (one per population) to 98 individuals (with additional commercial chickens) and compared with a mixture of individually and pooled sequenced populations. The imputation reduced the slope between heterozygosity estimates of array data and WGS data from 1.94 to 1.26 when using the smaller balanced reference panel and to 1.44 when using the larger but unbalanced reference panel. This generally supported the results from simulation but was less favorable, which argues for a larger reference panel when imputing to WGS.
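The slope comparison reported above can be sketched as an ordinary least squares fit across populations. This is an illustration under stated assumptions: the abstract does not say which estimate is regressed on which, so the sketch assumes the array-based (or imputed) estimate is regressed on the WGS estimate. The helper `ols_slope` and all heterozygosity values are invented for the example and are not taken from the study.

```python
def ols_slope(x, y):
    """Least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Made-up per-population heterozygosity estimates (one value per population).
he_wgs = [0.10, 0.15, 0.20, 0.25]
he_array = [0.20, 0.30, 0.40, 0.50]    # array data, before imputation
he_imputed = [0.12, 0.18, 0.24, 0.30]  # array data, after imputation

slope_array = ols_slope(he_wgs, he_array)
slope_imputed = ols_slope(he_wgs, he_imputed)
print(f"slope before imputation: {slope_array:.2f}")
print(f"slope after imputation:  {slope_imputed:.2f}")
```

Under this reading, a slope of 1 would mean the array-based estimates change in step with the WGS estimates, so a slope closer to 1 after imputation indicates a reduced bias.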
Conclusions
The results highlight the potential of using imputation for mitigation of SNP ascertainment bias but also underline the need for unbiased reference sets.
Springer Science and Business Media LLC
Related Results
How Imputation Can Mitigate Ascertainment Bias
Abstract
Background
Population genetic studies based on genotyped single nucleotide polymorphisms (SNPs) are influenced by a non-random selection of the SNPs included in th...
Novel design of imputation-enabled SNP arrays for breeding and research applications supporting multi-species hybridisation
Abstract
Array-based SNP genotyping platforms have low genotype error and missing data rates compared to genotyping-by-sequencing technologies. However, design decisions used to cre...
Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling
Abstract
Background
For assembling large whole-genome sequence datasets to be used routinely in research and breeding, the sequ...
Hubungan antara SNP rs3761863 terhadap kejadian reaksi reversal pada pasien MH tipe borderline di RSUP Prof. Dr. I.G.N.G. Ngoerah
Introduction: Reversal reaction (RR) is one of the morbidity burdens for Hansen's disease (MH) patients undergoing multi-drug therapy. Some risk factors for RR include age, stress,...
LmTag: functional-enrichment and imputation-aware tag SNP selection for population-specific genotyping arrays
Abstract
Despite the rapid development of sequencing technology, single-nucleotide polymorphism (SNP) arrays are still the most cost-effective genotyping solution...
Genotype Imputation
Abstract
A missing data problem arises in genetic epidemiological studies when genotypes of particular markers are unavailable fo...
GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies
Abstract
Left-censored missing values commonly exist in targeted metabolomics datasets and can be considered as missing not at random (MNAR). Imp...
Characterization and Preparation of Sago Starch (SS) Based Reinforced with Silver Nanoparticle (SNP)
This paper reported on the properties of sago starch (SS) films impregnated with different concentration of silver nanoparticles (SNP) of 100, 2000, 5000 rpm with weight ratio of 1...

