Parallel Likelihood Calculation for Phylogenetic Comparative Models: the SPLITT C++ Library

Abstract Phylogenetic comparative models (PCMs) have been used to study macroevolutionary patterns, to characterize adaptive phenotypic landscapes, to quantify rates of evolution, to measure the heritability of traits, and to test various evolutionary hypotheses. A major obstacle to applying these models has been the complexity of evaluating their likelihood function. Recent work has shown that, for many PCMs, the likelihood can be obtained in time proportional to the size of the tree using post-order tree traversal, also known as pruning. Despite this progress, inferring complex multi-trait PCMs on large trees remains a time-intensive task. Here, we study parallelizing the pruning algorithm as a generic technique for speeding up PCM inference. We implement several parallel traversal algorithms in the form of a generic C++ library for Serial and Parallel LIneage Traversal of Trees (SPLITT). Based on SPLITT, we provide examples of parallel likelihood evaluation for several popular PCMs, ranging from a single-trait Brownian motion model to complex multi-trait Ornstein-Uhlenbeck and mixed Gaussian phylogenetic models. Using the phylogenetic Ornstein-Uhlenbeck mixed model (POUMM) as a showcase, we run benchmarks on up to 24 CPU cores, reporting up to an order of magnitude parallel speed-up on simulated balanced and unbalanced trees of up to 100,000 tips with up to 16 traits. Because the parallel speed-up depends on multiple factors, the SPLITT library can automatically select the fastest traversal strategy for a given hardware, tree topology, and data. Combining SPLITT likelihood calculation with adaptive Metropolis sampling on real data, we show that the time for Bayesian POUMM inference on a tree of 10,000 tips can be reduced from several days to minutes. We conclude that parallel pruning effectively accelerates the likelihood calculation and, thus, the statistical inference of Gaussian phylogenetic models.
For time-intensive Bayesian inferences, we recommend combining this technique with adaptive Metropolis sampling. Beyond Gaussian models, the parallel tree traversal can be applied to numerous other models, including discrete trait and birth-death population dynamics models. Currently, SPLITT supports multi-core shared memory architectures, but it can be extended to distributed memory architectures as well as graphics processing units.
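The idea of parallelizing a post-order (pruning) traversal can be illustrated with a minimal C++ sketch. It is not SPLITT's API: the names `ParentVector`, `PruningLevels`, and `CountTips`, and the parent-index tree encoding, are assumptions made for illustration only. The sketch groups nodes into levels so that every node in a level depends only on nodes from earlier levels. A per-node tip count stands in for the per-node likelihood terms that a real PCM would accumulate.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical tree encoding: parent[i] is the index of node i's parent,
// or -1 if node i is the root.
using ParentVector = std::vector<int>;

// Group nodes into pruning levels: level 0 holds the tips, and a node enters
// the level right after the one in which its last child was processed.
std::vector<std::vector<int>> PruningLevels(const ParentVector& parent) {
  const std::size_t n = parent.size();
  std::vector<int> pending(n, 0);  // number of children not yet processed
  for (std::size_t i = 0; i < n; ++i) {
    if (parent[i] >= 0) {
      assert(static_cast<std::size_t>(parent[i]) < n);
      ++pending[parent[i]];
    }
  }
  std::vector<int> current;
  for (std::size_t i = 0; i < n; ++i) {
    if (pending[i] == 0) current.push_back(static_cast<int>(i));  // tips
  }
  std::vector<std::vector<int>> levels;
  while (!current.empty()) {
    std::vector<int> next;
    for (int v : current) {
      const int p = parent[v];
      if (p >= 0 && --pending[p] == 0) next.push_back(p);
    }
    levels.push_back(std::move(current));
    current = std::move(next);
  }
  return levels;
}

// Stand-in for a likelihood calculation: count the tips below each node.
std::vector<int> CountTips(const ParentVector& parent) {
  const std::size_t n = parent.size();
  std::vector<std::vector<int>> children(n);
  for (std::size_t i = 0; i < n; ++i) {
    if (parent[i] >= 0) children[parent[i]].push_back(static_cast<int>(i));
  }
  std::vector<int> tips(n, 0);
  for (const auto& level : PruningLevels(parent)) {
    // Each node reads only its children (finished in earlier levels) and
    // writes only its own entry, so this inner loop has no data dependencies
    // between iterations and could be split across threads.
    for (int v : level) {
      if (children[v].empty()) {
        tips[v] = 1;
      } else {
        int sum = 0;
        for (int c : children[v]) sum += tips[c];
        tips[v] = sum;
      }
    }
  }
  return tips;
}
```

For a tree encoded as `{3, 3, 4, 4, -1}` (nodes 0 and 1 under node 3, nodes 2 and 3 under the root 4), `PruningLevels` yields three levels and `CountTips` reports three tips at the root. The levels must still be visited in order, so the attainable speed-up depends on how many nodes each level contains, which is one way to see why tree shape can matter for parallel performance.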

Related Results

Primerjalna književnost na prelomu tisočletja
In a comprehensive and at times critical manner, this volume seeks to shed light on the development of events in Western (i.e., European and North American) comparative literature ...
Phylogenetic overdispersion of plant species in southern Brazilian savannas
Ecological communities are the result of not only present ecological processes, such as competition among species and environmental filtering, but also past and continuing evolutio...
PaNDA: Efficient Optimization of Phylogenetic Diversity in Networks
Abstract Phylogenetic diversity plays an important role in biodiversity, conservation, and evolutionary studies by measuring the diversity of a s...
Inferring Phylogenetic Networks Using PhyloNet
Abstract PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalit...
phyr: An R package for phylogenetic species-distribution modelling in ecological communities
Summary Model-based approaches are increasingly popular in ecological studies. A good example of this trend is the use of joint species distribution models to ask questions about ec...
A Single Channel Thermal-Hydraulic Calculation Module for PWR Pin-by-Pin Wise Coupled Calculation System
Abstract Due to the strong feedback effect between neutronics and thermal-hydraulics in the core of pressurized water reactors (PWR), neutronics and thermal-hydrauli...
METHODOLOGY OF THE DESIGN CALCULATION OF THE ELECTRO-HYDRAULIC SERVO DRIVE OF TECHNOLOGICAL EQUIPMENT
Machine-building industries and enterprises for modernization of railway rolling stock are constantly increasing the requirements for the technical and functional char...