
Inference of complex evolutionary histories of somatic mutations using Bayesian stochastic character mapping

Abstract
Introduction: Phylogenetic inference provides insight into the evolutionary mechanisms that shape hematopoietic progenitor populations over an individual lifetime. By performing whole-genome sequencing (WGS) on isogenic colonies derived from single cells, it is possible to identify somatic mutations and use them to reconstruct phylogenetic trees. Individual somatic mutations can then be assigned to individual branches of the tree, enabling rigorous inference of evolutionary parameters such as mutation age and fitness effects. Unfortunately, contemporary methods for localizing somatic mutations on the phylogeny are limited by the infinite sites model of evolution, which for simplicity assumes that each mutation occurred only once, precluding complex events (e.g., recurrent mutation). While most blood somatic mutations conform to this assumption, a recent survey of 39 hematopoietic phylogenies (N = 9,658 total progenitors) showed that exceptions increase with donor age, number of clones, and environmental exposures such as chemotherapy (Chapman et al. 2025, Nature), thus motivating methodological development. Here, we adapt a Bayesian method called stochastic character mapping (SCM) to relax the assumption of infinite sites and reveal complex histories of hematopoiesis.
Methods: SCM is a generalizable Bayesian framework for inferring the number, timing, and placement of character state transitions with respect to a phylogeny. Here, we adapted SCM to infer the origin(s) of somatic mutations given a phylogenetic tree of hematopoietic progenitors. SCM uses simulation to approximate the posterior distribution of evolutionary histories for each genetic variant given (a) the observed genotype states and confidence metrics, (b) a phylogeny with branch lengths measured in molecular time, and (c) a genotype substitution model.
Using the posterior distribution of histories for each variant, SCM yields a posterior probability for every genotype state at every node on the phylogenetic tree, enabling probabilistic assignment of mutation events. Importantly, SCM does not assume the infinite sites model of evolution, permitting recovery of complex evolutionary histories. As a demonstration of our method, we applied SCM to colony-based WGS data from hematopoietic progenitors collected from 3 donors (N = 149 total progenitors; DeBoy et al. 2023, NEJM).
Results: In addition to 66,780 somatic single nucleotide polymorphisms (SNPs) exhibiting simple histories of a single mutation event, SCM recovered 426 somatic SNPs with strong evidence (posterior probability ≥ 0.95) of complex evolutionary histories and rejected an additional 377 SNPs that were initially called as somatic but are better explained by unsampled germline heterozygosity. Among SNPs with complex evolutionary histories, 51.4% (n = 219/426) were C>T transitions, a common mutation signature associated with aging. Strikingly, 40.6% (n = 173/426) of complex histories were present in >2 contemporary samples, implicating local mutation rate heterogeneity or an association with clonal expansion. Using GENCODE (version 34), we also found that over half (58.2%; n = 248/426) of all somatic SNPs with complex histories could be associated with a gene. In each donor, we identified at least one somatic SNP that both possessed a complex evolutionary history and fell within a gene with known relevance to CHIP and/or hematologic malignancy (CUX1 c.e3-8755G>A, JAK2 p.V617F, LRP1B c.e67-3397A>G, and ETNK1 p.N155S). This observation suggests that some previous reports of CHIP-driver-independent oligoclonality may be explained by overconservative models of molecular evolution.
Conclusions: SCM offers unprecedented resolution into the complex histories of somatic evolution (e.g., recurrent mutation) in the hematopoietic progenitor population over an individual lifetime. Through reanalysis of clone-based WGS data, we recovered mutations with complex histories in phylogenies from each of three individuals, including variants within genes of known relevance to myeloproliferative disease. Forthcoming applications of our method to an expansive cohort of donors will provide valuable insight into the evolutionary/genetic mechanisms that shape hematopoietic lineages with age, disease, and therapeutic strategies. We anticipate that the relevance of our approach will grow over the coming decades as the scale and resolution of single-cell WGS methods continue to improve.
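To make the procedure described in the Methods concrete, the following is a minimal Python sketch of stochastic character mapping on a toy four-tip tree. It is an illustration under stated assumptions, not the authors' implementation: it uses a two-state substitution model (reference vs. mutant) with hypothetical rates `GAIN` and `LOSS`, treats tip genotypes as error-free (the abstract's confidence metrics are not modeled), fixes the root to the reference state, and samples branch histories by simple rejection sampling. All function names and the tree encoding are invented for this example.

```python
import math
import random

# Assumed two-state model: 0 = reference allele, 1 = mutant allele.
GAIN, LOSS = 0.05, 0.05  # hypothetical rates per unit of molecular time


def trans_prob(i, j, t):
    """P(state j at branch end | state i at branch start) for a two-state CTMC."""
    total = GAIN + LOSS
    stationary = (LOSS / total, GAIN / total)
    decay = math.exp(-total * t)
    return stationary[j] + ((1.0 if i == j else 0.0) - stationary[j]) * decay


def leaf(state, length):
    return {"state": state, "length": length, "children": []}


def node(children, length=0.0):
    return {"length": length, "children": children}


def prune(n):
    """Felsenstein pruning: n['L'][s] = P(tip states below n | state s at n)."""
    if not n["children"]:
        n["L"] = [1.0 if s == n["state"] else 0.0 for s in (0, 1)]
        return
    n["L"] = [1.0, 1.0]
    for c in n["children"]:
        prune(c)
        for s in (0, 1):
            n["L"][s] *= sum(trans_prob(s, k, c["length"]) * c["L"][k] for k in (0, 1))


def sample_states(n, rng, parent=None):
    """Sample node states from the root down, conditional on the tip data."""
    if parent is None:
        n["sampled"] = 0  # assumption: the root carries the reference allele
    else:
        w = [trans_prob(parent, s, n["length"]) * n["L"][s] for s in (0, 1)]
        n["sampled"] = 0 if rng.random() * (w[0] + w[1]) < w[0] else 1
    for c in n["children"]:
        sample_states(c, rng, n["sampled"])


def sample_branch(start, end, t, rng):
    """Rejection-sample the sequence of state changes along one branch."""
    while True:
        state, elapsed, changes = start, 0.0, []
        while True:
            elapsed += rng.expovariate(GAIN if state == 0 else LOSS)
            if elapsed >= t:
                break
            state = 1 - state
            changes.append(state)
        if state == end:
            return changes


def count_changes(n, rng):
    """Total number of state changes in one sampled history below node n."""
    total = 0
    for c in n["children"]:
        total += len(sample_branch(n["sampled"], c["sampled"], c["length"], rng))
        total += count_changes(c, rng)
    return total


def posterior_complex(tree, n_sims=2000, seed=0):
    """Fraction of sampled histories with more than one state change."""
    rng = random.Random(seed)
    prune(tree)
    hits = 0
    for _ in range(n_sims):
        sample_states(tree, rng)
        if count_changes(tree, rng) > 1:
            hits += 1
    return hits / n_sims


# Mutant tips forming one clade: a single gain on the shared branch suffices.
clade = node([node([leaf(1, 1.0), leaf(1, 1.0)], 1.0),
              node([leaf(0, 1.0), leaf(0, 1.0)], 1.0)])
# Mutant tips in two separate clades: no single-event history exists.
scattered = node([node([leaf(1, 1.0), leaf(0, 1.0)], 1.0),
                  node([leaf(1, 1.0), leaf(0, 1.0)], 1.0)])

print("clade:", posterior_complex(clade))
print("scattered:", posterior_complex(scattered))
```

In this toy setting, the clade pattern is mostly explained by one mutation event, whereas the scattered pattern cannot be explained by fewer than two events once the root is fixed to the reference state, so every sampled history for it is complex. A variant could then be flagged when this fraction meets a chosen threshold, such as the 0.95 cutoff reported in the Results.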
American Society of Hematology
