Contrastive learning of protein representations with graph neural networks for structural and functional annotations
Although protein sequence data is growing at an ever-increasing rate, the protein universe remains only sparsely covered by functional and structural annotations. Computational approaches have become efficient solutions for inferring annotations for unlabeled proteins by transferring knowledge from proteins with experimental annotations. Despite the increasing availability of protein structure data and the high coverage of high-quality predicted structures, e.g., by AlphaFold, many existing computational tools still rely only on sequence data to predict structural or functional annotations, including alignment algorithms such as BLAST and several sequence-based deep learning models. Here, we develop PenLight, a general deep learning framework for protein structural and functional annotation. PenLight uses a graph neural network (GNN) to integrate 3D protein structure data with protein language model representations. In addition, PenLight applies a contrastive learning strategy to train the GNN to learn protein representations that reflect similarities beyond sequence identity, such as semantic similarities in function or structure space. We benchmarked PenLight on a structural classification task and a functional annotation task, where PenLight achieved higher prediction accuracy and coverage than state-of-the-art methods.
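To illustrate the contrastive training idea described in the abstract, the following is a minimal sketch of a margin-based triplet objective: an anchor protein embedding is pulled toward a protein sharing its structural or functional label (the positive) and pushed away from one with a different label (the negative). This is an illustrative formulation, not PenLight's actual loss or API; the function names, the cosine similarity measure, and the margin value are assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors (plain Python lists)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Margin-based contrastive loss on one (anchor, positive, negative) triple.

    The loss is zero once the anchor is closer (in cosine similarity) to the
    positive than to the negative by at least `margin`; otherwise the gap is
    penalized linearly.
    """
    return max(0.0, cosine(anchor, negative) - cosine(anchor, positive) + margin)

# A well-separated triple incurs no loss; a collapsed one is penalized.
good = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])   # positive aligned with anchor
bad = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])    # negative aligned with anchor
```

In a full training loop, the embeddings would come from the GNN's pooled output over a protein's structure graph, and gradients of this loss would update the GNN weights so that same-label proteins cluster in embedding space.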