
TooT-PLM-P2S: Incorporating Secondary Structure Information into Protein Language Models

In bioinformatics, modeling the protein space to better predict function and structure has benefited from Protein Language Models (PLMs). PLMs are trained on protein amino acid sequences using self-supervised learning; Ankh is a prime example. While there has been some recent work on integrating three-dimensional structure with a PLM to enhance predictive performance, to date there has been no work on integrating secondary structure instead. Here we present TooT-PLM-P2S, which starts from the Ankh model pre-trained on 45 million proteins using self-supervised learning. TooT-PLM-P2S is initialized with Ankh's pre-trained encoder and decoder and then undergoes an additional training phase on approximately 10,000 proteins and their corresponding secondary structures. This retraining modifies both the encoder and the decoder and yields TooT-PLM-P2S. We then assess the impact of integrating secondary structure information by comparing Ankh and TooT-PLM-P2S on eight downstream tasks, including fluorescence prediction, solubility prediction, sub-cellular localization, and membrane protein classification. For both models, each downstream task required task-specific training. Few of the results showed statistically significant differences. On the primary metric, Ankh outperformed TooT-PLM-P2S on three of the eight tasks, and TooT-PLM-P2S did not outperform Ankh on any task. TooT-PLM-P2S did outperform Ankh on the precision metric for the task of discriminating membrane proteins from non-membrane proteins. Future work with expanded datasets and refined integration methods is needed.
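The additional training phase pairs each protein sequence with its secondary structure. As a rough illustration only, the sketch below shows one way such sequence-to-structure training pairs could be validated and aligned residue by residue. The three-state alphabet (H, E, C), the `P2SExample` type, and the `make_p2s_example` helper are assumptions made for this example; the abstract does not state the label set, the data format, or the tokenization used by the authors.

```python
from dataclasses import dataclass
from typing import List

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
# ASSUMPTION: a three-state secondary structure alphabet
# (H = helix, E = strand, C = coil). The source does not specify this.
SS3_STATES = set("HEC")


@dataclass(frozen=True)
class P2SExample:
    """Hypothetical container for one sequence-to-structure training pair."""
    source: List[str]  # one amino acid per position
    target: List[str]  # one secondary structure label per position


def make_p2s_example(sequence: str, structure: str) -> P2SExample:
    """Validate a (sequence, structure) pair and split it per residue."""
    sequence = sequence.strip().upper()
    structure = structure.strip().upper()
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    if not set(sequence) <= AMINO_ACIDS:
        raise ValueError("sequence contains non-standard residues")
    if not set(structure) <= SS3_STATES:
        raise ValueError("structure contains labels outside H/E/C")
    return P2SExample(source=list(sequence), target=list(structure))


# Toy pair, invented for illustration (not real data).
example = make_p2s_example("MKVLA", "CHHHC")
```

Each position in `source` lines up with the label at the same index in `target`, which is the property an encoder-decoder model would rely on when learning to map a sequence to its secondary structure.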

