Remote homology search with hidden Potts models
Abstract
Most methods for biological sequence homology search and alignment work with primary sequence alone, neglecting higher-order correlations. Recently, statistical physics models called Potts models have been used to infer all-by-all pairwise correlations between sites in deep multiple sequence alignments, and these pairwise couplings have improved 3D structure predictions. Here we extend the use of Potts models from structure prediction to sequence alignment and homology search by developing what we call a hidden Potts model (HPM) that merges a Potts emission process with a generative probability model of insertion and deletion. Because an HPM is incompatible with efficient dynamic programming alignment algorithms, we develop an approximate algorithm based on importance sampling, using simpler probabilistic models as proposal distributions. We test an HPM implementation on RNA structure homology search benchmarks, where we can compare directly to exact alignment methods that capture nested RNA base-pairing correlations (stochastic context-free grammars). HPMs perform promisingly in these proof-of-principle experiments.
Author summary
Computational homology search and alignment tools are used to infer the functions and evolutionary histories of biological sequences. Most widely used tools for sequence homology searches, such as BLAST and HMMER, rely on primary sequence conservation alone. It should be possible to make more powerful search tools by also considering higher-order covariation patterns induced by 3D structure conservation. Recent advances in 3D protein structure prediction have used a class of statistical physics models called Potts models to infer pairwise correlation structure in multiple sequence alignments. However, Potts models assume alignments are given and cannot build new alignments, limiting their use in homology search. We have extended Potts models to include a probability model of insertion and deletion so they can be applied to sequence alignment and remote homology search using a new model we call a hidden Potts model (HPM). Tests of our prototype HPM software show promising results in initial benchmarking experiments, though more work will be needed to use HPMs in practical tools.
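The general idea of importance sampling with a simpler proposal model can be illustrated on a toy problem. The sketch below is not the HPM algorithm, which samples alignments of a sequence to the model; it only estimates the log normalizing constant of a small pairwise Potts model over fixed-length sequences, drawing samples from an independent-site proposal and comparing against exhaustive enumeration. All names (`potts_log_weight`, `importance_log_z`, `random_model`), the random parameters, and the choice of proposal are assumptions made for illustration.

```python
import itertools
import math
import random


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def potts_log_weight(x, h, e):
    """Unnormalized log-weight: single-site terms plus all pairwise couplings."""
    n = len(x)
    total = sum(h[i][x[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            total += e[i][j][x[i]][x[j]]
    return total


def exact_log_z(h, e):
    """Log normalizing constant by enumeration (feasible only for tiny models)."""
    n, k = len(h), len(h[0])
    return log_sum_exp([potts_log_weight(x, h, e)
                        for x in itertools.product(range(k), repeat=n)])


def importance_log_z(h, e, n_samples, rng):
    """Estimate log Z as log mean of w(x)/q(x), with x drawn from proposal q.

    Assumed proposal: independent sites with q_i(a) proportional to exp(h[i][a]),
    i.e. the Potts model with its pairwise couplings dropped.
    """
    n, k = len(h), len(h[0])
    log_q = [[v - log_sum_exp(row) for v in row] for row in h]
    q = [[math.exp(v) for v in row] for row in log_q]
    log_ratios = []
    for _ in range(n_samples):
        x = [rng.choices(range(k), weights=q[i])[0] for i in range(n)]
        log_qx = sum(log_q[i][x[i]] for i in range(n))
        log_ratios.append(potts_log_weight(x, h, e) - log_qx)
    return log_sum_exp(log_ratios) - math.log(n_samples)


def random_model(n, k, coupling_scale, rng):
    """Hypothetical toy parameters: Gaussian fields and Gaussian couplings."""
    h = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(n)]
    e = [[[[rng.gauss(0, coupling_scale) for _ in range(k)]
           for _ in range(k)] for _ in range(n)] for _ in range(n)]
    return h, e


if __name__ == "__main__":
    rng = random.Random(0)
    h, e = random_model(4, 4, 0.3, rng)  # 4 sites, 4-letter alphabet
    print("exact log Z:      ", exact_log_z(h, e))
    print("importance sample:", importance_log_z(h, e, 5000, rng))
```

The estimate is only as good as the overlap between proposal and target: with weak couplings the independent-site proposal is close to the Potts model and the importance weights are nearly constant, whereas strong couplings would make the weights highly variable and require many more samples.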

