
Uncovering footprints of natural selection through time-frequency analysis of genomic summary statistics

Abstract
Natural selection leaves a spatial pattern along the genome, with a distortion in the haplotype distribution near the selected locus that becomes less prominent with increasing distance from the locus. Evaluating the spatial signal of a population-genetic summary statistic across the genome allows patterns of natural selection to be distinguished from neutrality. Different summary statistics highlight different components of genetic variation; therefore, considering the genomic spatial distribution of multiple summary statistics is expected to aid in uncovering subtle signatures of selection. In recent years, numerous methods have been devised that jointly consider genomic spatial distributions across summary statistics, using both classical machine learning and contemporary deep learning architectures. However, better predictions may be attainable by improving how the features used as input to machine learning algorithms are extracted from these summary statistics. To achieve this goal, we apply three time-frequency analysis approaches (wavelet transform, multitaper spectral analysis, and S-transform) to summary statistic arrays. Each analysis method converts a one-dimensional summary statistic array into a two-dimensional image of spectral density or visual representation of time-frequency analysis, permitting the simultaneous assessment of temporal and spectral information. We use these images as input to convolutional neural networks and consider combining models across different time-frequency representation approaches through the ensemble stacking technique. Application of our modeling framework to data simulated from neutral and selective sweep scenarios reveals that it achieves almost perfect accuracy and power across a diverse set of evolutionary settings, including population size changes and test sets for which sweep strength, softness, and timing parameters were drawn from a wide range. Moreover, a scan of whole-genome sequencing data from central European humans recapitulated well-established sweep candidates and predicted novel cancer-associated genes as sweeps with high support. Given that this modeling framework is also robust to missing data, we believe that it will represent a welcome addition to the population-genomic toolkit for learning about adaptive processes from genomic data.
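The conversion of a one-dimensional summary statistic array into a two-dimensional time-frequency image can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the wavelet family, the scales, or the window count, so the Ricker wavelet, the function names `ricker` and `scalogram`, the 128 windows, and the synthetic dip standing in for a sweep-like reduction in a statistic are all assumptions made for illustration only.

```python
import numpy as np

def ricker(points, width):
    """Ricker (Mexican hat) wavelet sampled at `points` positions (assumed wavelet choice)."""
    t = np.arange(points) - (points - 1) / 2.0
    norm = 2.0 / (np.sqrt(3.0 * width) * np.pi ** 0.25)
    return norm * (1.0 - (t / width) ** 2) * np.exp(-0.5 * (t / width) ** 2)

def scalogram(stat, widths):
    """Continuous wavelet transform of a 1D array into a 2D (scale x position) image."""
    stat = np.asarray(stat, dtype=float)
    stat = stat - stat.mean()  # remove the mean so the image reflects local structure
    image = np.empty((len(widths), stat.size))
    for i, w in enumerate(widths):
        # Truncate the kernel to about ten widths, never longer than the signal.
        points = max(1, min(int(10 * w), stat.size))
        image[i] = np.convolve(stat, ricker(points, w), mode="same")
    return np.abs(image)  # magnitude of the wavelet coefficients

# Synthetic toy input (not real data): a statistic computed in 128 genomic
# windows, with a dip centered on window 64 mimicking a sweep-like valley.
positions = np.arange(128)
toy_stat = 1.0 - 0.8 * np.exp(-0.5 * ((positions - 64) / 6.0) ** 2)

widths = np.arange(1, 17)           # hypothetical set of wavelet scales
image = scalogram(toy_stat, widths)
print(image.shape)                  # (16, 128): one row per scale, one column per window
```

An image of this kind, one per summary statistic, is the sort of input a convolutional neural network could take; the multitaper and S-transform representations named in the abstract would produce analogous two-dimensional arrays by different means.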

