Generative continuous time model reveals epistatic signatures in protein evolution
Abstract
Protein evolution is fundamentally shaped by epistasis, where the effect of a mutation depends on the sequence context. As standard phylogenetic methods assume independently evolving sites, there is a need for more complex models based on accurate estimates of the fitness landscape. Good candidates are modern generative models, such as the Potts model, which successfully capture epistatic effects. However, recent work on generative evolutionary models usually uses discrete time, making these models difficult to integrate with the standard frameworks of evolutionary biology. We introduce a continuous-time sequence evolution model that uses the Gillespie algorithm and is parameterized by a generative Potts model. This approach enables us to simulate realistic, family-specific evolutionary trajectories and allows for direct comparison with independent-site models. Surprisingly, we find that while epistasis significantly slows down evolution, it does not change the average evolutionary rates at individual sites. This is explained by the rate heterogeneity caused by context dependence: we show that the rate at some positions varies from zero to high values depending on the context, while other positions are essentially independent of the context. Finally, we show that epistasis leads to a systematic underestimation of the evolutionary distance inferred between sequences. Overall, our work provides a new tool for simulating realistic protein evolution and offers novel insights into the complex interplay between epistasis and evolutionary dynamics.
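To make the idea of a Gillespie simulation driven by a Potts model concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the functional form of the substitution rates, so the sketch assumes a Metropolis-like rate, min(1, exp(-ΔE)), which satisfies detailed balance with respect to the Potts distribution. All function and parameter names (`potts_energy`, `substitution_rates`, `gillespie`, `h`, `J`) are hypothetical.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Potts energy E(s) = -sum_i h[i, s_i] - sum_{i<j} J[i, j, s_i, s_j]."""
    L = len(seq)
    energy = -h[np.arange(L), seq].sum()
    for i in range(L):
        for j in range(i + 1, L):
            energy -= J[i, j, seq[i], seq[j]]
    return energy

def substitution_rates(seq, h, J):
    """Rate of every single-site change (site i -> state b) from the current sequence.

    ASSUMPTION: Metropolis-like rate min(1, exp(-dE)); the paper may use another form.
    The rates depend on the whole sequence, which is where epistasis enters.
    """
    L, q = h.shape
    e0 = potts_energy(seq, h, J)
    rates = np.zeros((L, q))
    for i in range(L):
        for b in range(q):
            if b == seq[i]:
                continue  # no rate for "mutating" to the current state
            mutant = seq.copy()
            mutant[i] = b
            delta = potts_energy(mutant, h, J) - e0
            rates[i, b] = np.exp(min(0.0, -delta))
    return rates

def gillespie(seq, h, J, t_max, rng):
    """Evolve seq in continuous time up to t_max; return (final sequence, event count)."""
    seq = seq.copy()
    t, n_events = 0.0, 0
    while True:
        rates = substitution_rates(seq, h, J)
        total = rates.sum()
        # Waiting time to the next substitution is exponential with rate `total`.
        t += rng.exponential(1.0 / total)
        if t > t_max:
            return seq, n_events
        # Pick which substitution happens, with probability proportional to its rate.
        flat = rng.choice(rates.size, p=rates.ravel() / total)
        i, b = divmod(flat, rates.shape[1])
        seq[i] = b
        n_events += 1
```

Calling `gillespie(seq, h, J, t_max, np.random.default_rng(0))` with an integer-encoded sequence, a field array `h` of shape (L, q), and a coupling array `J` of shape (L, L, q, q) returns the evolved sequence and the number of substitutions. Setting `J` to zero gives an independent-site model with the same code path, which is the kind of comparison the abstract describes. The energy is recomputed from scratch for every candidate mutation for clarity; a practical implementation would compute energy differences incrementally.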
Significance statement
Understanding how proteins evolve is central to molecular biology and phylogenetics. Traditional evolutionary models assume that mutations act independently at each position in a sequence. This neglects epistasis — the fact that the effect of a mutation depends on the rest of the sequence — which is known to be ubiquitous in proteins. By simulating protein evolution in continuous time using a generative model, our approach produces realistic sequences and reveals how epistasis shapes evolutionary dynamics. We find that epistasis slows down evolution and can mislead common methods for estimating evolutionary timescales. This work bridges modern generative models of proteins and phylogenetics, providing new tools to better understand molecular evolution.