Bayesian coalescent inference of in-host evolution using Next Generation Sequencing
Abstract
Within an infected individual, influenza virus exists as a heterogeneous population of variants. When the viral population is represented as a consensus sequence, information about minority variants is lost. However, next generation sequencing (NGS) makes it possible to identify nucleotide substitutions that segregate at low frequencies in the viral population. These substitutions can give insight into the within-host processes that drive the virus’s evolution, which is a step towards understanding the dynamics of the disease. During the course of an infection, mutations may occur, and at each segregating site the frequency of the derived allele in the population will fluctuate. We develop a method that uses the relative frequencies of mutations in NGS data from a viral population sampled at multiple time points to infer past population dynamics with a Bayesian skyline model. Using coalescent theory, we analytically derive the joint allele frequency spectrum for a population across multiple time points, and relate this to the coalescent intervals generated from the skyline model. We demonstrate the model on data taken from populations of equine influenza virus sampled during an infection, and show that it is possible to infer a posterior distribution of effective viral population size through time. We also show how the model can be used to infer the probability that a mutation occurred within-host, as opposed to being an ancestral mutation that occurred prior to infection.

Author Summary
When a host is infected by a virus, many particles of the infecting agent enter the host's body. This viral population is composed of many closely related viruses that continue to diversify by mutating as they reproduce in the host. New sequencing technologies make it possible to quantify the proportions of the different variants present in the host at a particular time. Unfortunately, the data produced by such sequencing techniques are difficult to interpret, as they consist of many unlinked copies of relatively small fragments of genetic code distributed along the genome of the virus. We designed a method that combines models of virus genealogies with the frequencies of mutations appearing in the data to reconstruct how the viral population varies inside the host. It also allows us to time the appearance of particular variants. This could be useful for determining whether a particular mutation (e.g. one providing drug resistance) arose within the host or was already circulating beforehand. We applied our method to data on the within-host evolution of equine influenza.
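As a rough illustration of the coalescent intervals that a skyline model generates, the following Python sketch may help. It is not the authors' implementation and does not reproduce the joint allele frequency spectrum derived in the paper. It assumes a haploid population and a classic skyline in which the effective population size is constant within each coalescent interval, so that k lineages coalesce at rate k(k-1)/(2N). The function names and the example population sizes are hypothetical.

```python
import random


def expected_interval_lengths(pop_sizes):
    """Expected length (in generations) of each coalescent interval.

    Assumption: pop_sizes[j] is the effective population size while
    n - j lineages remain, where n = len(pop_sizes) + 1 sampled lineages.
    """
    n = len(pop_sizes) + 1
    expected = []
    for j, size in enumerate(pop_sizes):
        k = n - j  # number of lineages present during this interval
        rate = k * (k - 1) / (2.0 * size)  # total pairwise coalescence rate
        expected.append(1.0 / rate)
    return expected


def simulate_interval_lengths(pop_sizes, rng):
    """Draw one random realisation of the coalescent interval lengths.

    Each interval is exponentially distributed with the same rate used in
    expected_interval_lengths; rng is a random.Random instance.
    """
    n = len(pop_sizes) + 1
    lengths = []
    for j, size in enumerate(pop_sizes):
        k = n - j
        rate = k * (k - 1) / (2.0 * size)
        lengths.append(rng.expovariate(rate))
    return lengths


if __name__ == "__main__":
    # Hypothetical skyline for 4 sampled lineages: the population is larger
    # near the sampling time and smaller further back in the past.
    sizes = [200.0, 100.0, 50.0]
    print(expected_interval_lengths(sizes))
    print(simulate_interval_lengths(sizes, random.Random(1)))
```

In a Bayesian skyline analysis the per-interval population sizes would be parameters with a prior and would be sampled from a posterior rather than fixed as here. The sketch only shows how interval lengths depend on those sizes.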

