
Protein-peptide Interaction Representation Learning with Pretrained Language Models

Abstract: Protein-peptide interactions (PpIs) play essential roles in diverse cellular processes, yet their systematic identification remains challenging because experimentally annotated protein-peptide interaction data are scarce. To address this challenge, we present PepInter, a sequence-based deep learning (DL) framework that leverages large-scale pretraining on structurally derived pseudo protein-peptide pairs to learn interaction-relevant representations. Specifically, energy-dominant peptide fragments are extracted from protein complexes curated from non-redundant Protein Data Bank (PDB) structures, enabling the construction of pseudo protein-peptide interaction pairs that capture interface interaction patterns shared with canonical protein-protein interactions. This strategy allows the model to acquire interaction-aware priors in the absence of large-scale annotated protein-peptide complex datasets. Built upon the ESM-Cambrian (ESMC) architecture, PepInter adopts a two-stage pretraining strategy. In the first stage, masked language modeling is used to learn general protein sequence representations. In the second stage, the model is further trained to predict Rosetta-derived energetic scores, explicitly incorporating structural interaction signals into the learned embeddings. Following pretraining, PepInter is fine-tuned for both protein-peptide interaction classification and peptide bioactivity regression tasks. Across multiple benchmark datasets, including protein-peptide binding affinity prediction, PepInter consistently outperforms existing baseline methods and demonstrates strong generalization in identifying biologically meaningful PpIs. Case studies further highlight its ability to recover known interaction patterns and predict novel protein-peptide interactions. Together, these results establish PepInter as a scalable and effective framework for protein-peptide interaction prediction, with strong potential to accelerate peptide-based drug discovery.
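
The first pretraining stage mentioned in the abstract, masked language modeling, can be illustrated with a small sketch. This is not PepInter's or ESMC's actual code: the function name `mask_sequence`, the `<mask>` token string, the 15% masking rate, and the example sequence are all assumptions chosen for illustration, and the common refinement of sometimes substituting a random or unchanged residue instead of the mask token is left out.

```python
import random

MASK = "<mask>"  # hypothetical mask token; real tokenizers define their own


def mask_sequence(seq, mask_rate=0.15, seed=0):
    """Hide a fraction of residues and record what was hidden.

    Returns (tokens, targets): tokens is the sequence with some residues
    replaced by MASK, and targets maps each masked position to the
    original residue that a model would be trained to predict.
    """
    tokens = list(seq)
    if not tokens:
        return [], {}
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    targets = {i: tokens[i] for i in positions}
    for i in positions:
        tokens[i] = MASK
    return tokens, targets


if __name__ == "__main__":
    # Arbitrary 20-residue sequence, used only to show the output shape.
    tokens, targets = mask_sequence("MKTAYIAKQRQISFVKSHFS")
    print("".join(t if t != MASK else "_" for t in tokens))
    print(targets)
```

During training, the model sees `tokens` and is scored only on how well it predicts the residues in `targets`, which is what pushes it to learn context-dependent sequence representations.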
