
Evolutionary distance estimation and fidelity of pair wise sequence alignment

Abstract

Background: Evolutionary distances are a critical measure in comparative genomics and molecular evolutionary biology. A simulation study was used to examine the effect of alignment accuracy of DNA sequences on evolutionary distance estimation.

Results: Under the studied conditions, distance estimation was relatively unaffected by alignment error (50% or more of the sites incorrectly aligned) as long as 50% or more of the sites were identical among the sequences (observed P-distance < 0.5). Beyond this threshold, the alignment procedure artificially inflates the apparent sequence identity, skewing distance estimates, and creating alignments that are essentially indistinguishable from random data. This general result was independent of substitution model, sequence length, and insertion and deletion size and rate.

Conclusion: Examination of the estimated sequence identity may yield some guidance as to the accuracy of the alignment. Inaccurate alignments are expected to have large effects on analyses dependent on site specificity, but analyses that depend on evolutionary distance may be somewhat robust to alignment error as long as fewer than half of the sites have diverged.
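The observed P-distance mentioned in the abstract is the proportion of compared sites at which two aligned sequences differ. The Python sketch below illustrates that quantity and one common model-based correction. It is not taken from the study: the function names, the choice to skip gapped columns, the use of the Jukes-Cantor (JC69) correction, and the toy sequences are all assumptions made for illustration.

```python
import math

GAP = "-"  # assumed gap symbol in the aligned sequences


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites among columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == GAP or b == GAP:
            continue  # assumption: gapped columns are excluded from the comparison
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def jc69_distance(p: float) -> float:
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3).

    The correction is undefined once p reaches 0.75, so infinity is returned there.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical toy alignment, used only to exercise the functions.
aligned_a = "ACGTACGT-ACG"
aligned_b = "ACGAACGTTACC"

p = p_distance(aligned_a, aligned_b)
d = jc69_distance(p)
below_threshold = p < 0.5  # the P-distance threshold quoted in the abstract

print(f"observed P-distance: {p:.4f}")
print(f"JC69 distance:       {d:.4f}")
print(f"P-distance < 0.5:    {below_threshold}")
```

In this sketch, comparing the observed P-distance with 0.5 mirrors the check the abstract suggests as rough guidance on alignment accuracy; the corrected distance always exceeds the observed one for any nonzero P-distance because it accounts for multiple substitutions at the same site.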
Springer Science and Business Media LLC

