Javascript must be enabled to continue!
Identifying novel associations in GWAS by hierarchical Bayesian latent variable detection of differentially misclassified phenotypes
View through CrossRef
Abstract
Heterogeneity in definition and measurement of complex diseases in Genome-Wide Association Studies (GWAS) may lead to misdiagnoses and misclassification errors that can significantly impact discovery of disease loci. While well appreciated, almost all analyses of GWAS data consider reported disease phenotype values as is without accounting for potential misclassification. Here, we introduce Phenotype Latent variable Extraction of disease misdiagnosis (PheLEx), a GWAS analysis framework that learns and corrects misclassified phenotypes using structured genotype associations within a dataset. PheLEx consists of a hierarchical Bayesian latent variable model, where inference of differential misclassification is accomplished using filtered genotypes while implementing a full mixed model to account for population structure and genetic relatedness in study populations. Through simulations, we show that the PheLEx framework dramatically improves recovery of the correct disease state when considering realistic allele effect sizes compared to existing methodologies designed for Bayesian recovery of disease phenotypes. We also demonstrate the potential of PheLEx for extracting new candidate loci from existing GWAS data by analyzing epilepsy and bipolar disorder phenotypes available from the UK Biobank dataset, where we identify new candidate disease loci not previously reported for these datasets that have biological connections to the disease phenotypes and/or were identified in independent GWAS. In the discussion, we consider both the broader consequences and importance of careful interpretation of misclassification correction in GWAS phenotypes, as well as potential of PheLEx for re-analyzing existing GWAS data to make novel discoveries.
Author Summary
Prevalent misdiagnosis of diseases due to lack of understanding and/or gold-standard diagnostic measures can impact any analytics that follow. These misdiagnosis errors are especially significant in the domain of psychiatric or psychological disorders where the definition of disease and/or their diagnostic tools are always in flux or under further improvement. Here, we propose a method to extract misdiagnosis from disease and infer the correct disease phenotype. We examined the performance of this method on rigorous simulations and real disease phenotypes obtained from the UK Biobank database. We found that this method successfully recovered misdiagnosed individuals in simulations using a carefully designed hierarchical Bayesian latent variable model framework. For real disease phenotypes, epilepsy and bipolar disorder, this method not only suggested an alternate phenotype but results from this method were also used to discover new genomic loci that have been previously showed to be associated with the respective phenotypes, suggesting that this method can be further used to reanalyze large-scale genetic datasets to discover novel loci that might be ignored using traditional methodologies.
Title: Identifying novel associations in GWAS by hierarchical Bayesian latent variable detection of differentially misclassified phenotypes
Description:
Abstract
Heterogeneity in definition and measurement of complex diseases in Genome-Wide Association Studies (GWAS) may lead to misdiagnoses and misclassification errors that can significantly impact discovery of disease loci.
While well appreciated, almost all analyses of GWAS data consider reported disease phenotype values as is without accounting for potential misclassification.
Here, we introduce Phenotype Latent variable Extraction of disease misdiagnosis (PheLEx), a GWAS analysis framework that learns and corrects misclassified phenotypes using structured genotype associations within a dataset.
PheLEx consists of a hierarchical Bayesian latent variable model, where inference of differential misclassification is accomplished using filtered genotypes while implementing a full mixed model to account for population structure and genetic relatedness in study populations.
Through simulations, we show that the PheLEx framework dramatically improves recovery of the correct disease state when considering realistic allele effect sizes compared to existing methodologies designed for Bayesian recovery of disease phenotypes.
We also demonstrate the potential of PheLEx for extracting new candidate loci from existing GWAS data by analyzing epilepsy and bipolar disorder phenotypes available from the UK Biobank dataset, where we identify new candidate disease loci not previously reported for these datasets that have biological connections to the disease phenotypes and/or were identified in independent GWAS.
In the discussion, we consider both the broader consequences and importance of careful interpretation of misclassification correction in GWAS phenotypes, as well as potential of PheLEx for re-analyzing existing GWAS data to make novel discoveries.
Author Summary
Prevalent misdiagnosis of diseases due to lack of understanding and/or gold-standard diagnostic measures can impact any analytics that follow.
These misdiagnosis errors are especially significant in the domain of psychiatric or psychological disorders where the definition of disease and/or their diagnostic tools are always in flux or under further improvement.
Here, we propose a method to extract misdiagnosis from disease and infer the correct disease phenotype.
We examined the performance of this method on rigorous simulations and real disease phenotypes obtained from the UK Biobank database.
We found that this method successfully recovered misdiagnosed individuals in simulations using a carefully designed hierarchical Bayesian latent variable model framework.
For real disease phenotypes, epilepsy and bipolar disorder, this method not only suggested an alternate phenotype but results from this method were also used to discover new genomic loci that have been previously showed to be associated with the respective phenotypes, suggesting that this method can be further used to reanalyze large-scale genetic datasets to discover novel loci that might be ignored using traditional methodologies.
Related Results
Valid inference for machine learning-assisted GWAS
Valid inference for machine learning-assisted GWAS
Abstract
Machine learning (ML) has revolutionized analytical strategies in almost all scientific disciplines including human genetics and genomics. Due to challenge...
e-GRASP: an integrated evolutionary and GRASP resource for exploring disease associations
e-GRASP: an integrated evolutionary and GRASP resource for exploring disease associations
Abstract
Background
Genome-wide association studies (GWAS) have become a mainstay of biological research concerned with d...
Epidemiological, diagnostic and medical-social aspects of latent syphilis
Epidemiological, diagnostic and medical-social aspects of latent syphilis
Objective — to study epidemiological, clinical and medical-social aspects of latent syphilis in Ukraine over the past 40 years.
Materials and methods. Data of patients with latent ...
Sample-efficient Optimization Using Neural Networks
Sample-efficient Optimization Using Neural Networks
<p>The solution to many science and engineering problems includes identifying the minimum or maximum of an unknown continuous function whose evaluation inflicts non-negligibl...
Increased life expectancy of heart failure patients in a rural center by a multidisciplinary program
Increased life expectancy of heart failure patients in a rural center by a multidisciplinary program
Abstract
Funding Acknowledgements
Type of funding sources: None.
INTRODUCTION Patients with heart failure (HF)...
Figs S1-S9
Figs S1-S9
Fig. S1. Consensus phylogram (50 % majority rule) resulting from a Bayesian analysis of the ITS sequence alignment of sequences generated in this study and reference sequences from...
Linking GWAS to pharmacological treatments for psychiatric disorders
Linking GWAS to pharmacological treatments for psychiatric disorders
Abstract
Importance
Large-scale genome-wide association studies (GWASs) are expected to inform the development of pharmacologic...
Pengaruh Kepemimpinan Kepala Sekolah, Lingkungan Kerja, dan Sarana Pembelajaran terhadap Kinerja Guru Melalui Motivasi Kerja
Pengaruh Kepemimpinan Kepala Sekolah, Lingkungan Kerja, dan Sarana Pembelajaran terhadap Kinerja Guru Melalui Motivasi Kerja
Penelitian ini mengkaji pengaruh kepemimpinan kepala sekolah, lingkungan sekolah, dan sarana pembelajaran terhadap kinerja guru SMAS Reformasi Plus, dengan motivasi guru sebagai va...

