Inferring demographic parameters in bacterial genomic data using Bayesian and hybrid phylogenetic methods
Background: Recent developments in sequencing technologies make it possible to obtain genome sequences from a large number of isolates in a very short time. Bayesian phylogenetic approaches can take advantage of these data by simultaneously inferring the phylogenetic tree, evolutionary timescale, and demographic parameters (such as population growth rates), while naturally integrating uncertainty in all parameters. Despite their desirable properties, Bayesian approaches can be computationally intensive, hindering their use for outbreak investigations involving genome data for large numbers of pathogen isolates. An alternative to full Bayesian inference is a hybrid approach, where the phylogenetic tree and evolutionary timescale are estimated first using maximum likelihood. Under this hybrid approach, demographic parameters are inferred from the estimated trees instead of the sequence data, using maximum likelihood, Bayesian inference, or approximate Bayesian computation. This can vastly reduce the computational burden, but has the disadvantage of ignoring the uncertainty in the phylogenetic tree and evolutionary timescale.
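To make the second stage of a hybrid approach concrete, here is a minimal sketch of inferring one demographic parameter from a fixed, already-dated tree by maximum likelihood. It is an illustration only and not the method used in the study: it assumes a constant-size Kingman coalescent and isochronous sampling (all tips sampled at the same time), whereas the abstract mentions growth rates and dated outbreak samples. The function name `constant_size_mle` and the `(k, t)` input format are hypothetical.

```python
from math import comb


def constant_size_mle(intervals):
    """ML estimate of the scaled population size under a constant-size coalescent.

    `intervals` is a list of (k, t) pairs read off a fixed time tree:
    k lineages persisted for duration t before the next coalescent event.
    While k lineages exist, the waiting time is exponential with rate
    C(k, 2) / N, so the likelihood is maximised at
    N = sum(C(k, 2) * t) / (number of coalescent events).
    The estimate is in the tree's time units (effective size times
    generation time). Isochronous sampling is assumed for simplicity.
    """
    if not intervals:
        raise ValueError("at least one coalescent interval is required")
    weighted_time = sum(comb(k, 2) * t for k, t in intervals)
    return weighted_time / len(intervals)


# Hypothetical intervals from a four-tip tree (times in years).
example_intervals = [(4, 0.5), (3, 1.0), (2, 3.0)]
estimate = constant_size_mle(example_intervals)
```

Because the tree is treated as known, the estimate carries no uncertainty from the tree or timescale, which is exactly the limitation of hybrid approaches noted above.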
Results: We compared the performance of a fully Bayesian and a hybrid method by analysing six whole-genome SNP data sets from a range of bacteria and simulations. The estimates from the two methods were very similar, suggesting that the hybrid method is a valid alternative for very large datasets. However, we also found that congruence between these methods is contingent on the presence of strong temporal structure in the data (i.e. clocklike behaviour), which is typically verified using a date-randomisation test in a Bayesian framework. To reduce the computational burden of this Bayesian test we implemented a date-randomisation test using a rapid maximum likelihood method, which has similar performance to its Bayesian counterpart.
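The logic of a date-randomisation test can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation described in the study: the abstract does not name the maximum likelihood method, so the sketch substitutes a least-squares regression of root-to-tip distance on sampling date as the rate estimator, and it uses a simple pass criterion (the observed rate exceeds every rate obtained with shuffled dates). The function names, the number of replicates, and the example data are all hypothetical.

```python
import random


def regression_rate(dates, distances):
    """Least-squares slope of root-to-tip distance against sampling date."""
    n = len(dates)
    mean_date = sum(dates) / n
    mean_dist = sum(distances) / n
    sxx = sum((d - mean_date) ** 2 for d in dates)
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    sxy = sum((d - mean_date) * (y - mean_dist)
              for d, y in zip(dates, distances))
    return sxy / sxx


def date_randomisation_test(dates, distances, n_replicates=100, seed=0):
    """Compare the observed rate with rates from date-shuffled replicates.

    Returns (observed_rate, null_rates, passes). `passes` is True when the
    observed rate is larger than every rate estimated after permuting the
    sampling dates among tips (a simplified criterion for this sketch).
    """
    rng = random.Random(seed)
    observed = regression_rate(dates, distances)
    null_rates = []
    for _ in range(n_replicates):
        shuffled = list(dates)
        rng.shuffle(shuffled)  # break the link between tips and dates
        null_rates.append(regression_rate(shuffled, distances))
    return observed, null_rates, observed > max(null_rates)


# Hypothetical, perfectly clocklike data: 10 tips sampled yearly.
tip_dates = [2000 + i for i in range(10)]
tip_distances = [0.010 + 0.001 * i for i in range(10)]
observed, null_rates, passes = date_randomisation_test(tip_dates, tip_distances)
```

Each replicate only requires re-running a fast estimator rather than a full Bayesian analysis, which is why a test of this form is inexpensive.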
Conclusions: Hybrid approaches can produce reliable inferences of evolutionary timescales and phylodynamic parameters in a fraction of the time required for fully Bayesian analyses. As such, they are a valuable alternative in outbreak studies involving a large number of isolates.

