Prediction of protein subcellular localization using deep learning and data augmentation
Abstract
Identifying the subcellular localization of a protein is important for understanding its molecular function. It provides valuable insights that can greatly aid research on protein function and the detection of potential cell-surface or secreted drug targets. Predicting protein subcellular localization with bioinformatics methods is an inexpensive alternative to experimental approaches. Many computational tools have been built during the past two decades; however, producing reliable predictions has remained a challenge. In this study, a deep learning (DL) technique is proposed to enhance the precision of the analytical engine of one of these tools, PSORTb v3.0. Its conventional support vector machine (SVM) model was replaced by a state-of-the-art DL method (BiLSTM) together with a data augmentation technique (SeqGAN). The combination of BiLSTM and SeqGAN outperformed the SVM, improving precision from 57.4% to 75%. The method was applied to a dataset of 8230 protein sequences that was experimentally derived by the Brinkman Lab. The presented model provides promising results for future research. The source code of the model is available at https://github.com/mgetech/SubLoc.
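The abstract reports precision but does not say how it is aggregated across localization classes. As a minimal sketch of the metric itself, the Python below computes per-class precision (true positives divided by all predictions of that class) and an unweighted macro average. The macro average, the function names, and the toy labels are assumptions made for illustration; they are not taken from the paper or its repository.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: TP / (TP + FP)} for every label predicted at least once."""
    true_pos = Counter()
    predicted = Counter()
    for truth, guess in zip(y_true, y_pred):
        predicted[guess] += 1
        if truth == guess:
            true_pos[guess] += 1
    return {label: true_pos[label] / count for label, count in predicted.items()}


def macro_precision(y_true, y_pred):
    """Unweighted mean of the per-class precisions (assumed aggregation)."""
    scores = per_class_precision(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Toy labels, illustrative only (not data from the study).
y_true = ["Cytoplasmic", "Cytoplasmic", "Periplasmic", "Extracellular"]
y_pred = ["Cytoplasmic", "Periplasmic", "Periplasmic", "Extracellular"]

if __name__ == "__main__":
    print(per_class_precision(y_true, y_pred))
    print(round(macro_precision(y_true, y_pred), 3))
```

A class-frequency-weighted average would give a different number on imbalanced data, so the aggregation choice matters when comparing the reported 57.4% and 75% figures with other tools.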
Related Results
Prediction of Protein Subcellular Localization Based on Fusion of Multi-view Features
The prediction of protein subcellular localization is critical for inferring protein functions, gene regulations and protein-protein interactions. With the advances of high-through...
Enhancing Non-Formal Learning Certificate Classification with Text Augmentation: A Comparison of Character, Token, and Semantic Approaches
Aim/Purpose: The purpose of this paper is to address the gap in the recognition of prior learning (RPL) by automating the classification of non-formal learning certificates using d...
Deep generative model for protein subcellular localization prediction
Abstract
Protein sequence determines not only its structure but also its subcellular localization. Although a series of artificial intelligence m...
Indoor Localization System Based on RSSI-APIT Algorithm
An indoor localization system based on the RSSI-APIT algorithm is designed in this study. Integrated RSSI (received signal strength indication) and non-ranging APIT (approximate pe...
Endothelial Protein C Receptor
Introduction: The protein C anticoagulant pathway plays a critical role in the negative regulation of the blood clotting response. The pathway is triggered by thrombin, which allows ...
The Effectiveness of Data Augmentation for Bone Suppression in Chest Radiograph using Convolutional Neural Network
Objective: Bone suppression of chest radiograph holds great promise to improve the localization accuracy in Image-Guided Radiation Therapy (IGRT). However, data scarcity has long b...
Deep generative model for protein subcellular localization prediction
Abstract
Protein sequence not only determines its structure but also provides important clues of its subcellular localization. Although a series of artificial int...
Validating subcellular localization prediction tools with mycobacterial proteins
Abstract
Background
The computational prediction of mycobacterial proteins' subcellular localization is of key importance for proteome annotation...

