IdentPMP: identification of moonlighting proteins in plants using sequence-based learning models
Background
A moonlighting protein is a protein that can perform two or more functions. Current moonlighting protein prediction tools focus mainly on proteins from animals and microorganisms, and because the cells and proteins of animals and plants differ, these tools may predict plant moonlighting proteins inaccurately. Hence, a benchmark data set and a prediction tool specific to plant moonlighting proteins are needed.
Methods
This study used protein feature classes derived from a data set constructed in house to develop a web-based prediction tool. We first built a data set of plant proteins and removed redundant sequences. We then performed feature selection, feature normalization, and feature dimensionality reduction on the training data. Next, preliminary machine learning models were used to select the feature classes that performed best in plant moonlighting protein prediction. The selected feature classes were incorporated into the final prediction tool. Finally, we compared five machine learning methods, optimized their parameters by grid search, and chose the most suitable method as the final model.
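As a rough illustration of the feature extraction and normalization steps, the sketch below computes a simple sequence-based feature and rescales it. The abstract does not name the feature classes actually used, so amino acid composition is an assumption chosen only because it is a common sequence-derived feature; the function names and toy sequences are likewise hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(sequence):
    """Return the fraction of each standard amino acid in a sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns become 0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical toy sequences, not taken from the IdentPMP data set.
sequences = ["MKVLAA", "MKKKGG", "ACDEFG"]
features = [aa_composition(s) for s in sequences]
normalized = min_max_normalize(features)
```

In a real pipeline the minimum and maximum of each feature would be computed on the training data only and then reused for the test data, so that no information leaks from the independent test set.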
Results
The prediction results indicated that eXtreme Gradient Boosting (XGBoost) performed best, so it was used as the algorithm for the prediction tool, called IdentPMP (Identification of Plant Moonlighting Proteins). On the independent test set, the area under the precision-recall curve (AUPRC) and the area under the receiver operating characteristic curve (AUC) of IdentPMP are 0.43 and 0.68, which are 19.44% (0.43 vs. 0.36) and 13.33% (0.68 vs. 0.60) higher, respectively, than those of state-of-the-art non-plant-specific methods. This further demonstrates that a benchmark data set and a plant-specific prediction tool are required for plant moonlighting protein studies. Finally, we implemented the tool as a web application, freely available at http://identpmp.aielab.net/.
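The percentages reported above are relative gains, that is, (new - baseline) / baseline. A minimal check of that arithmetic, using only the four scores given in the abstract (the helper name is illustrative):

```python
def relative_gain_percent(new, baseline):
    """Percentage by which `new` exceeds `baseline`."""
    return (new - baseline) / baseline * 100


# Scores quoted in the abstract: IdentPMP vs. the non-plant-specific baseline.
auprc_gain = relative_gain_percent(0.43, 0.36)
auc_gain = relative_gain_percent(0.68, 0.60)

print(f"AUPRC gain: {auprc_gain:.2f}%")  # 19.44%
print(f"AUC gain: {auc_gain:.2f}%")      # 13.33%
```

In absolute terms the gains are 0.07 for AUPRC and 0.08 for AUC.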