Multiple sequence alignment accuracy and evolutionary distance estimation

Abstract

Background: Sequence alignment is a common tool in bioinformatics and comparative genomics. It is generally assumed that multiple sequence alignment yields better results than pairwise sequence alignment, but this assumption has rarely been tested, and never with the control provided by simulation analysis. This study used sequence simulation to examine the gain in accuracy from adding a third sequence to a pairwise alignment, concentrating on how the phylogenetic position of the additional sequence relative to the first pair changes the accuracy of the initial pair's alignment as well as their estimated evolutionary distance.

Results: The maximal gain in alignment accuracy was found not when the third sequence is directly intermediate between the initial two sequences, but rather when it perfectly subdivides the branch leading from the root of the tree to one of the original sequences (making it half as close to one sequence as the other). Evolutionary distance estimation in the multiple alignment framework, however, is largely unrelated to alignment accuracy and instead depends on the position of the third sequence: the closer the branch leading to the third sequence is to the root of the tree, the larger the estimated distance between the first two sequences.

Conclusion: The bias in distance estimation appears to be a direct result of the standard greedy progressive algorithm used by many multiple alignment methods. These results have implications for choosing new taxa and genomes to sequence when resources are limited.
Springer Science and Business Media LLC
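
The two quantities the abstract compares, alignment accuracy and estimated evolutionary distance, can be illustrated with a minimal Python sketch. The abstract does not state which accuracy measure or substitution model the study used, so everything here is an assumption made for illustration: accuracy is taken as the fraction of truly aligned residue pairs recovered by the estimated alignment, and distance uses the Jukes-Cantor (JC69) correction. The function names are hypothetical and not from the paper.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of mismatches over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(a: str, b: str) -> float:
    """Jukes-Cantor distance (assumed model; the study's model is not given here)."""
    p = p_distance(a, b)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def aligned_pairs(a: str, b: str) -> set:
    """Set of (i, j) residue-index pairs that share a column in the alignment."""
    pairs = set()
    i = j = 0
    for x, y in zip(a, b):
        if x != "-" and y != "-":
            pairs.add((i, j))
        if x != "-":
            i += 1
        if y != "-":
            j += 1
    return pairs


def alignment_accuracy(true_a: str, true_b: str, est_a: str, est_b: str) -> float:
    """Fraction of truly aligned residue pairs recovered by the estimated alignment."""
    truth = aligned_pairs(true_a, true_b)
    if not truth:
        raise ValueError("true alignment has no aligned residue pairs")
    return len(truth & aligned_pairs(est_a, est_b)) / len(truth)
```

Columns in which both sequences carry a gap are skipped by every function, so two rows taken from a three-sequence alignment can be passed in directly to compare against the pairwise result.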

Related Results

Evolutionary distance estimation and fidelity of pair wise sequence alignment
Abstract Background Evolutionary distances are a critical measure in comparative genomics and molecular evolutionary biology. A simu...
Influence of alignment uncertainty on homology and phylogenetic modeling
Most evolutionary analyses or structure modeling are based upon pre-estimated multiple sequence alignment (MSA) models. From a computational point of view, it is too complex to est...
Evolution and the cell
Genotype to phenotype, and back again Evolution is intimately linked to biology at the cellular scale- evolutionary processes act on the very genetic material that is carried and ...
The accuracy of several multiple sequence alignment programs for proteins
Abstract Background There have been many algorithms and software programs implemented for the inference of multiple sequence alignments of protei...
Figs S1-S9
Fig. S1. Consensus phylogram (50 % majority rule) resulting from a Bayesian analysis of the ITS sequence alignment of sequences generated in this study and reference sequences from...
Evolutionary Biomechanics
Life has diversified on Earth in many stunning ways. Understanding how this diversity arose and has been maintained is a common interest for many evolutionary biologists. One appro...
Refined Evolutionary Trees Through an Exceptionally Compatible Alignment-Substitution Model
A phylogenetic tree commonly represents evolutionary relationships within a set of protein sequences. Various methods and strategies have been used to improve the accuracy of phylo...
Ontology Alignment Techniques
Sometimes the use of a single ontology is not sufficient to cover different vocabularies for the same domain, and it becomes necessary to use several ontologies in order to encompa...