
Multiple sequence alignment accuracy and evolutionary distance estimation

Abstract

Background: Sequence alignment is a common tool in bioinformatics and comparative genomics. It is generally assumed that multiple sequence alignment yields better results than pairwise sequence alignment, but this assumption has rarely been tested, and never with the control provided by simulation analysis. This study used sequence simulation to examine the gain in accuracy from adding a third sequence to a pairwise alignment, concentrating on how the phylogenetic position of the additional sequence relative to the first pair changes the accuracy of the initial pair's alignment as well as their estimated evolutionary distance.

Results: The maximal gain in alignment accuracy was found not when the third sequence is directly intermediate between the initial two sequences, but rather when it perfectly subdivides the branch leading from the root of the tree to one of the original sequences (making it half as close to one sequence as the other). Evolutionary distance estimation in the multiple alignment framework, however, is largely unrelated to alignment accuracy and instead depends on the position of the third sequence: the closer the branch leading to the third sequence is to the root of the tree, the larger the estimated distance between the first two sequences.

Conclusion: The bias in distance estimation appears to be a direct result of the standard greedy progressive algorithm used by many multiple alignment methods. These results have implications for choosing new taxa and genomes to sequence when resources are limited.
Springer Science and Business Media LLC
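
To illustrate why the estimated evolutionary distance between two sequences depends on how they are aligned, here is a minimal Python sketch. It is not the method used in the study: the abstract does not say which distance estimator was applied, so the Jukes-Cantor correction is assumed here purely for illustration, and the function name `jukes_cantor_distance` and the toy sequences are hypothetical.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Estimate substitutions per site between two aligned DNA sequences.

    Assumption: Jukes-Cantor model; columns containing a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    # Keep only columns where neither sequence has a gap.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")

    # Proportion of ungapped columns that differ (the p-distance).
    p = sum(a != b for a, b in pairs) / len(pairs)

    # The correction is undefined at saturation (p >= 3/4).
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# The same two toy sequences, aligned two different ways.
well_placed_gap = jukes_cantor_distance("ACGTACGTAC", "ACG-ACGTAC")
misplaced_gap = jukes_cantor_distance("ACGTACGTAC", "ACGACGTAC-")
print(well_placed_gap, misplaced_gap)
```

In this toy case the only difference between the two calls is where the gap is placed, yet the misplaced gap turns matching columns into apparent substitutions and inflates the estimate. The bias reported in the abstract concerns progressive multiple alignment specifically; this sketch only shows the general sensitivity of a distance estimate to the alignment it is computed from.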

