
Estimating statistical significance of local protein profile-profile alignments

Alignment of sequence families described by profiles provides a sensitive means for establishing homology between proteins and is important in protein evolutionary, structural, and functional studies. In the context of a steadily growing amount of sequence data, estimating the statistical significance of alignments, including profile-profile alignments, plays a key role in alignment-based homology search algorithms. Still, it remains an open question whether a single type of distribution governs profile-profile alignment scores and, if so, which one, especially when profile-profile substitution scores involve such terms as secondary structure predictions. This study presents a methodology for estimating the statistical significance of alignments of this type. The methodology rests on a new algorithm developed for generating random profiles such that their alignment scores are distributed similarly to those obtained for real unrelated profiles. We show that statistical accuracy, sensitivity, and the rate of high-quality alignments improve when alignments are characterized statistically by establishing the dependence of statistical parameters on various measures associated with both individual and pairwise profile characteristics. Implemented in the COMER software, the proposed methodology yielded an increase of up to 34.2% in the number of true positives and up to 61.8% in the number of high-quality alignments with respect to the previous version of the COMER method. A new version (v1.5.1) of the COMER software is available at https://sourceforge.net/projects/comer. The COMER software is also available on GitHub at https://github.com/minmarg/comer and as a Docker image (https://hub.docker.com/r/minmar/comer).
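As a rough illustration of what "estimating statistical significance" of a local alignment score usually means, the sketch below converts a score into an E-value and a P-value under the classical extreme value (Gumbel) form E = K·m·n·exp(−λS). This is an assumption made for illustration only: the abstract itself leaves open which distribution governs profile-profile scores, and this is not the COMER implementation. The function names and the values of `lam` and `K` are hypothetical.

```python
import math


def evalue(score, query_len, target_len, lam, K):
    """Expected number of chance alignments scoring >= score.

    Assumes Gumbel-type statistics, E = K * m * n * exp(-lam * S).
    lam and K are hypothetical placeholders; in practice they would be
    fitted to scores of unrelated (or random) profile pairs.
    """
    return K * query_len * target_len * math.exp(-lam * score)


def pvalue(e):
    """Probability of at least one chance hit, assuming the number of
    chance hits is Poisson-distributed with mean e: P = 1 - exp(-e)."""
    return -math.expm1(-e)


if __name__ == "__main__":
    lam, K = 0.3, 0.1  # hypothetical statistical parameters
    for s in (20.0, 40.0, 60.0):
        e = evalue(s, query_len=200, target_len=300, lam=lam, K=K)
        print(f"score={s:5.1f}  E-value={e:.3e}  P-value={pvalue(e):.3e}")
```

In the setting the abstract describes, parameters such as `lam` and `K` would not be constants but would depend on characteristics of the individual profiles and of the profile pair.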
Cold Spring Harbor Laboratory

Related Results

COFFEE: an objective function for multiple sequence alignments.
Abstract. MOTIVATION: In order to increase the accuracy of multiple sequence alignments, we designed a new strategy for optimizing multiple sequence alignments by gen...
Endothelial Protein C Receptor
Introduction: The protein C anticoagulant pathway plays a critical role in the negative regulation of the blood clotting response. The pathway is triggered by thrombin, which allows ...
Multiple Alignments of Data Objects and Generalized Center Star Algorithm
Multiple alignments of strings have been extensively studied as an effective tool to study string-type data such as DNA. In this paper, we generalize the notion of multiple alignme...
Evaluation of driver visual demand at different design speeds on complex two-dimensional rural highway alignments
Road crashes are a major cause of loss of human life, property and money throughout the world. One of the reasons behind these crashes is the interaction between drivers and road a...
Profile–profile methods provide improved fold‐recognition: A study of different profile–profile alignment methods
Abstract: To improve the detection of related proteins, it is often useful to include evolutionary information for both the query and target proteins. One method to include this info...
Scoring alignments by embedding vector similarity
Abstract: Sequence similarity is of paramount importance in biology, as similar sequences tend to have similar function and share common ancestry. Scoring matrices, such as PAM or BL...
Microwave Ablation with or Without Chemotherapy in Management of Non-Small Cell Lung Cancer: A Systematic Review
Abstract. Introduction: Microwave ablation (MWA) has emerged as a minimally invasive treatment for patients with inoperable non-small cell lung cancer (NSCLC). However, whether it i...
Steering Protein Fermentation in Pigs
Protein fermentation in pigs has been associated with diarrhea through the presence of potentially toxic metabolites, including ammonia, branched chain fatty acids, biogenic amines...
