
Protein-peptide Interaction Representation Learning with Pretrained Language Models

Abstract

Protein-peptide interactions (PpIs) play essential roles in diverse cellular processes, yet their systematic identification remains challenging because experimentally annotated protein-peptide interaction data are scarce. To address this challenge, we present PepInter, a sequence-based deep learning (DL) framework that leverages large-scale pretraining on structurally derived pseudo protein-peptide pairs to learn interaction-relevant representations. Specifically, energy-dominant peptide fragments are extracted from protein complexes curated from non-redundant Protein Data Bank (PDB) structures, enabling the construction of pseudo protein-peptide interaction pairs that capture interface interaction patterns shared with canonical protein-protein interactions. This strategy allows the model to acquire interaction-aware priors in the absence of large-scale annotated protein-peptide complex datasets.

Built upon the ESM-Cambrian (ESMC) architecture, PepInter adopts a two-stage pretraining strategy. In the first stage, masked language modeling is used to learn general protein sequence representations. In the second stage, the model is further trained to predict Rosetta-derived energetic scores, explicitly incorporating structural interaction signals into the learned embeddings. Following pretraining, PepInter is fine-tuned for both protein-peptide interaction classification and peptide bioactivity regression tasks.

Across multiple benchmark datasets, including protein-peptide binding affinity prediction, PepInter consistently outperforms existing baseline methods and demonstrates strong generalization in identifying biologically meaningful PpIs. Case studies further highlight its ability to recover known interaction patterns and predict novel protein-peptide interactions. Together, these results establish PepInter as a scalable and effective framework for protein-peptide interaction prediction, with strong potential to accelerate peptide-based drug discovery.
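
The abstract does not say how an "energy-dominant" fragment is selected, so the following is only an illustrative sketch of one plausible reading: given per-residue interface energies for one chain of a complex, pick the contiguous window with the lowest (most favorable) summed energy and treat it as the pseudo peptide. The function name, the fixed window length, the example sequence, and the energy values are all hypothetical and are not taken from PepInter.

```python
from typing import List, Tuple


def energy_dominant_fragment(
    sequence: str,
    residue_energies: List[float],
    length: int,
) -> Tuple[int, str, float]:
    """Return (start, fragment, total_energy) for the contiguous window of
    `length` residues with the lowest (most favorable) summed energy.

    Assumption: lower energy means a stronger interface contribution, as in
    Rosetta-style scoring. This selection rule is illustrative only.
    """
    if len(sequence) != len(residue_energies):
        raise ValueError("sequence and residue_energies must have equal length")
    if not 0 < length <= len(sequence):
        raise ValueError("length must be between 1 and len(sequence)")

    # Sum the first window, then slide one residue at a time.
    window = sum(residue_energies[:length])
    best_start, best_energy = 0, window
    for start in range(1, len(sequence) - length + 1):
        # Add the residue entering the window, drop the one leaving it.
        window += residue_energies[start + length - 1] - residue_energies[start - 1]
        if window < best_energy:
            best_start, best_energy = start, window
    fragment = sequence[best_start:best_start + length]
    return best_start, fragment, best_energy


# Hypothetical partner chain with made-up per-residue interface energies.
partner = "MKTAYIAKQR"
energies = [0.0, -0.5, -0.25, -2.0, -3.0, -1.5, -0.5, 0.0, 0.5, 0.0]

start, peptide, total = energy_dominant_fragment(partner, energies, length=4)
print(start, peptide, total)  # 3 AYIA -7.0
```

Under this reading, the selected fragment would be paired with the sequence of the other chain in the complex to form one pseudo protein-peptide training pair, with the summed energy serving as a candidate regression target for the second pretraining stage.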

