
CATHe2: Enhanced CATH Superfamily Detection Using ProstT5 and Structural Alphabets

Abstract

Motivation: The CATH database is a free, publicly available online resource that provides annotations about the evolutionary and structural relationships of protein domains. Because of the influx of protein structures, driven mainly by the recent AlphaFold breakthrough, manual intervention is no longer feasible. The CATH team therefore recently developed an automatic CATH superfamily classifier called CATHe, which uses a feed-forward network classifier with protein Language Model (pLM) embeddings as input. Using the same dataset, this paper presents CATHe2, which improves on CATHe by replacing the older pLM ProtT5 with a more recent version called ProstT5, and by introducing domain 3D information as input to the classifier in the form of a Structural Alphabet representation, namely 3Di sequence embeddings. Finally, CATHe2 implements a new version of the feed-forward network (FNN, i.e., non-recurrent neural network) classifier architecture, tuned for the CATH superfamily prediction task.

Results: The best CATHe2 model reaches an accuracy of 92.2 ± 0.7% with an F1 score of 82.3 ± 1.3%, an absolute improvement of 9.9% in F1 score and 6.6% in accuracy over the previous CATHe version (85.6 ± 0.4% accuracy and 72.4 ± 0.7% F1 score) on its largest dataset (~1700 superfamilies). This model uses ProstT5 AA sequence and 3Di sequence embeddings as input to the classifier, but a simplified version requiring only AA sequences already improves CATHe’s F1 score by 6.7 ± 1.3% and accuracy by 6.6 ± 0.7% on its largest dataset.

Availability & Implementation: The code is available at https://GitHub.com/Mouret-Orfeu/CATHe2. Datasets: https://doi.org/10.5281/zenodo.14534966

Contact: orfeu.mouret.pro@outlook.fr, j.abbass@kingston.ac.uk
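The classifier described in the abstract (a feed-forward network fed with AA and 3Di sequence embeddings) can be illustrated with a minimal sketch. This is not the CATHe2 implementation: the layer sizes, the single hidden layer, the random weights, and every function name below are hypothetical, and the sketch assumes each domain has already been reduced to one fixed-length embedding per sequence, which the abstract does not specify. Real embeddings would come from ProstT5; here they are random toy vectors.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax over a list of logits."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (untrained, for illustration only)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(aa_emb, di_emb, params):
    """Concatenate the two embeddings and run a non-recurrent forward pass."""
    x = aa_emb + di_emb  # list concatenation: AA features followed by 3Di features
    (w1, b1), (w2, b2) = params
    hidden = relu(linear(x, w1, b1))
    return softmax(linear(hidden, w2, b2))


# Toy sizes (assumptions): real embeddings and the ~1700 superfamilies are far larger.
EMB_DIM, HIDDEN, N_CLASSES = 8, 16, 5
rng = random.Random(0)
params = [init_layer(2 * EMB_DIM, HIDDEN, rng), init_layer(HIDDEN, N_CLASSES, rng)]

aa_emb = [rng.gauss(0, 1) for _ in range(EMB_DIM)]  # stand-in for an AA sequence embedding
di_emb = [rng.gauss(0, 1) for _ in range(EMB_DIM)]  # stand-in for a 3Di sequence embedding

probs = predict(aa_emb, di_emb, params)
predicted_class = max(range(N_CLASSES), key=probs.__getitem__)
print(predicted_class, round(sum(probs), 6))
```

The AA-only variant mentioned in the Results would correspond to passing only `aa_emb` through a network whose input size is `EMB_DIM` instead of `2 * EMB_DIM`.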
Cold Spring Harbor Laboratory

