TooT-PLM-P2S: Incorporating Secondary Structure Information into Protein Language Models
In bioinformatics, modeling the protein space to better predict function
and structure has benefited from Protein Language Models (PLMs), which are
trained on amino acid sequences using self-supervised learning.
Ankh is a prime example of such a PLM. While there has been some recent
work on integrating structure with a PLM to enhance predictive
performance, to date there has been no work on integrating secondary
structure rather than three-dimensional structure. Here we present
TooT-PLM-P2S, which starts from the Ankh model pre-trained on 45 million
proteins using self-supervised learning. TooT-PLM-P2S is initialized with
Ankh's pre-trained encoder and decoder and then undergoes an additional
training phase with approximately 10,000 proteins and their corresponding
secondary structures. This retraining updates both the encoder and the
decoder. We then assess the impact of integrating secondary
structure information into the Ankh model by comparing Ankh and
TooT-PLM-P2S on eight downstream tasks including fluorescence and
solubility prediction, sub-cellular localization, and membrane protein
classification. For both Ankh and TooT-PLM-P2S, the downstream tasks
required task-specific training. Few of the results showed statistically
significant differences. On the primary metric, Ankh outperformed
TooT-PLM-P2S on three of the eight tasks, and TooT-PLM-P2S did not
outperform Ankh on any task. TooT-PLM-P2S did outperform Ankh on the
precision metric for the task of discriminating membrane proteins from
non-membrane proteins. Future work with expanded datasets and refined
integration methods is needed.
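
The abstract does not say which statistical test was used to compare the two models. As an illustration only, one common way to check whether a difference between two models on the same test set is statistically significant is a paired permutation (sign-flip) test on per-example correctness. The sketch below assumes that each model's predictions have already been scored as 1 (correct) or 0 (incorrect) per test protein; the function name and parameters are hypothetical and do not come from the paper.

```python
import random


def paired_permutation_test(correct_a, correct_b, n_resamples=10_000, seed=0):
    """Two-sided paired permutation (sign-flip) test.

    correct_a, correct_b: per-example 0/1 correctness for two models
    evaluated on the SAME examples, in the same order.
    Returns (observed accuracy difference A - B, estimated p-value).
    Illustrative sketch; not the procedure used in the paper.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both models must be scored on the same examples")
    if not correct_a:
        raise ValueError("need at least one example")

    # Per-example difference: +1 if only A is right, -1 if only B is right.
    diffs = [a - b for a, b in zip(correct_a, correct_b)]
    observed_total = sum(diffs)

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the model labels are exchangeable,
        # so each paired difference may have its sign flipped at random.
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(total) >= abs(observed_total):
            hits += 1

    # Add-one smoothing keeps the estimated p-value strictly above zero.
    p_value = (hits + 1) / (n_resamples + 1)
    return observed_total / len(diffs), p_value
```

Under these assumptions, a small p-value suggests that the accuracy gap between the two models is unlikely to be due to chance alone, while a large one is consistent with the abstract's observation that few differences were significant.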

