
Protein secondary structure and remote homology detection

Abstract

A protein can be represented by its primary, secondary, or tertiary structure. With recent advances in AI, there is now as much tertiary as primary structural data available. Fast and accurate search methods exist for both types of data, with searches over both representations being highly precise. However, primary structure data can sometimes be incomplete. As a result, tertiary structure has become the gold standard for remote homology detection.

How does secondary structure perform in remote homology detection? Secondary structure interprets proteins as a sequence using an alphabet representing helices, strands, or loops. It shares its sequential nature with primary structure while retaining topological information similar to tertiary structure.

To assess the effectiveness of secondary structure in remote homology detection, we devised a challenging classification task aimed at determining the superfamily membership of very distantly related protein domains. We used benchmarks from the CATH and SCOP databases and evaluated sequence and structure alignment algorithms on primary, secondary, and tertiary structures.

As expected, both basic and advanced sequence alignment algorithms applied to primary structure achieved high precision, but their overall area under the curve was lower compared to the gold standard of structural alignment using tertiary structure.

Surprisingly, a simple string comparison algorithm applied to secondary structure performed close to the gold standard. This result supports the hypothesis that key structural information is already encoded in secondary structure and suggests that secondary structure may be a promising representation to use when high-confidence structural data is unavailable, such as in cases involving protein flexibility and disorder.
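To make the idea of comparing proteins as secondary structure strings concrete, here is a minimal sketch. The abstract does not say which string comparison algorithm was used, so plain Levenshtein edit distance stands in for it here. The three-letter alphabet (H for helix, E for strand, L for loop), the function names, and the toy labels and strings are all assumptions made for illustration, not the authors' method or data.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two secondary structure strings."""
    # prev[j] holds the distance between the first i-1 letters of a
    # and the first j letters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Length-normalised similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def predict_superfamily(query: str, references: List[Tuple[str, str]]) -> str:
    """Return the label of the reference string most similar to the query."""
    best_label, _ = max(references, key=lambda item: similarity(query, item[1]))
    return best_label


# Hypothetical toy data: labels and strings are invented for illustration.
toy_references = [
    ("toy_A", "HHHHLLHHHH"),  # mostly helical
    ("toy_B", "EEEELLEEEE"),  # mostly strands
]
toy_query = "HHHLLLHHHH"
predicted = predict_superfamily(toy_query, toy_references)
```

In this toy case the query differs from the `toy_A` string by a single substitution, so `predicted` is `"toy_A"`. A benchmark like the one described would rank all reference domains by such a score and then compute precision and area under the curve over the ranking.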
