
Generative continuous time model reveals epistatic signatures in protein evolution

Abstract

Protein evolution is fundamentally shaped by epistasis, where the effect of a mutation depends on the sequence context. As standard phylogenetic methods assume independently evolving sites, there is a need for more complex models based on accurate estimations of the fitness landscape. Good candidates are modern generative models, such as the Potts model, which successfully capture epistatic effects. However, recent work on generative evolutionary models usually uses discrete time, making these models difficult to integrate with the standard frameworks in evolutionary biology. We introduce a continuous-time sequence evolution model using the Gillespie algorithm and parameterized by a generative Potts model. This approach enables us to simulate realistic, family-specific evolutionary trajectories and allows for direct comparison with independent-site models. Surprisingly, we find that while epistasis significantly slows down evolution, it does not change the average evolutionary rates at individual sites. This is explained by the rate heterogeneity caused by context-dependence: we show that the rate at some positions varies from zero to high values depending on the context, while other positions are essentially independent of the context. Finally, we show that epistasis leads to a systematic underestimation bias in the inference of evolutionary distance between sequences. Overall, our work provides a new tool for simulating realistic protein evolution and offers novel insights into the complex interplay between epistasis and evolutionary dynamics.

Significance statement

Understanding how proteins evolve is central to molecular biology and phylogenetics. Traditional evolutionary models assume that mutations act independently at each position in a sequence. This neglects epistasis, the fact that the effect of a mutation depends on the rest of the sequence, which is known to be ubiquitous in proteins. By simulating protein evolution in continuous time using a generative model, our approach produces realistic sequences and reveals how epistasis shapes evolutionary dynamics. We find that epistasis slows down evolution and can mislead common methods for estimating evolutionary timescales. This work bridges modern generative models of proteins and phylogenetics, providing new tools to better understand molecular evolution.
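
The abstract does not give the transition rates, so the following is only a minimal sketch of how a Gillespie simulation driven by a Potts energy could look, not the authors' implementation. The Metropolis-style rate rule, the function names (`potts_energy`, `mutation_rates`, `gillespie`) and the toy fields `h` and couplings `J` are all assumptions made for illustration; a real application would use parameters inferred from a protein family.

```python
import math
import random

def potts_energy(seq, h, J):
    """Potts energy E(s) = -sum_i h[i][s_i] - sum_{i<j} J[i][j][s_i][s_j]."""
    n = len(seq)
    energy = -sum(h[i][seq[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy -= J[i][j][seq[i]][seq[j]]
    return energy

def mutation_rates(seq, h, J, q):
    """Rate of every single-site change (site, new_state).

    ASSUMPTION: Metropolis-style rate min(1, exp(-dE)); the source text
    does not specify which rate rule is used.
    """
    e0 = potts_energy(seq, h, J)
    rates = {}
    for i in range(len(seq)):
        for b in range(q):
            if b == seq[i]:
                continue
            mutant = seq[:i] + [b] + seq[i + 1:]
            delta = potts_energy(mutant, h, J) - e0
            rates[(i, b)] = 1.0 if delta <= 0 else math.exp(-delta)
    return rates

def gillespie(seq, h, J, q, t_max, rng):
    """Simulate substitutions in continuous time up to t_max."""
    seq = list(seq)
    t, n_events = 0.0, 0
    while True:
        rates = mutation_rates(seq, h, J, q)
        total = sum(rates.values())
        t += rng.expovariate(total)  # exponential waiting time to next event
        if t > t_max:
            return seq, n_events
        moves = list(rates)
        # Pick which substitution occurs, proportionally to its rate.
        i, b = rng.choices(moves, weights=[rates[m] for m in moves])[0]
        seq[i] = b
        n_events += 1

# Toy parameters (hypothetical): 3 sites, 2 states, sites 0 and 1 coupled.
L, q = 3, 2
h = [[0.0] * q for _ in range(L)]
J = [[[[0.0] * q for _ in range(q)] for _ in range(L)] for _ in range(L)]
J[0][1][0][0] = J[0][1][1][1] = 2.0  # sites 0 and 1 prefer to match

final, n_events = gillespie([0, 0, 0], h, J, q, t_max=5.0, rng=random.Random(0))
```

In this toy setup the rate at site 0 depends on the state of site 1, while site 2 is uncoupled and always mutates at rate 1, which mirrors the context-dependence described in the abstract on a very small scale.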

