
Computational methods for ubiquitination site prediction using physicochemical properties of protein sequences

Abstract

Background: Ubiquitination is an important protein post-translational modification that has been widely investigated. Various experimental and computational methods have been developed to identify ubiquitination sites in protein sequences. This paper explores machine learning methods for predicting ubiquitination sites from the physicochemical properties (PCPs) of the amino acids in protein sequences.

Results: We first establish six ubiquitination data sets, whose records contain both ubiquitination sites and non-ubiquitination sites in varying numbers of protein sequence segments. To build these data sets, protein sequence segments are extracted from the original protein sequences used in four published papers on ubiquitination, and 531 PCP features are calculated for each extracted segment by averaging the PCP values from AAindex (Amino Acid index database) over all amino acids in the segment. Several machine learning methods are then applied to the six segment-PCP data sets: four Bayesian network methods (Naïve Bayes (NB), Feature Selection NB (FSNB), Model Averaged NB (MANB), and Efficient Bayesian Multivariate Classifier (EBMC)) and three regression methods (Support Vector Machine (SVM), Logistic Regression (LR), and Least Absolute Shrinkage and Selection Operator (LASSO)). Five-fold cross-validation and the Area Under the Receiver Operating Characteristic Curve (AUROC) are used to evaluate the prediction performance of each method. The results demonstrate that the PCP data of protein sequences contain information that machine learning methods can mine for ubiquitination site prediction. The comparative results show that EBMC, SVM, and LR perform better than the other methods, and that EBMC is the only method to reach an AUROC of at least 0.6 on all six data sets. The results also show that EBMC tends to perform better on larger data sets.

Conclusions: Machine learning methods have been applied to ubiquitination site prediction based on the physicochemical properties of the amino acids in protein sequences. The results demonstrate that machine learning can extract useful information from PCP data on protein sequences, and that EBMC, SVM, and LR (especially EBMC) outperform the other methods tested for ubiquitination prediction.
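The two quantitative steps named in the abstract, averaging PCP values over a segment and scoring predictions by AUROC, can be illustrated with a minimal sketch. This is not the authors' code: the table `TOY_PCP`, its two property names, its numeric values, and the function names `segment_features` and `auroc` are all assumptions made for illustration, and the real feature set draws 531 indices from AAindex rather than two.

```python
from statistics import mean

# Hypothetical toy table: two illustrative property scales for four residues.
# These numbers are placeholders, not values taken from the paper.
TOY_PCP = {
    "hydrophobicity": {"A": 1.8, "K": -3.9, "L": 3.8, "G": -0.4},
    "volume": {"A": 88.6, "K": 168.6, "L": 166.7, "G": 60.1},
}


def segment_features(segment, pcp_table):
    """Return one feature per property: its mean over all residues in the segment."""
    return {
        name: mean(scale[residue] for residue in segment)
        for name, scale in pcp_table.items()
    }


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return correct / (len(positives) * len(negatives))


if __name__ == "__main__":
    # A made-up four-residue segment containing a lysine (K).
    print(segment_features("AKLG", TOY_PCP))
    # Made-up labels (1 = ubiquitination site) and classifier scores.
    print(auroc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

In a five-fold cross-validation setting, a function like `auroc` would be applied to the held-out fold's labels and scores in each of the five rounds; how the per-fold values are combined is not stated in the abstract.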
Springer Science and Business Media LLC

Related Results

Bendless-mediated K63 ubiquitination modulates cellular signalling to regulate Drosophila hematopoiesis
Abstract Ubiquitination is a reversible modification whose traditional role has been associated with K48-linked poly-ubiquitination involved in proteasomal degradation....
Endothelial Protein C Receptor
Introduction: The protein C anticoagulant pathway plays a critical role in the negative regulation of the blood clotting response. The pathway is triggered by thrombin, which allows ...
Cbl-mediated K63-linked ubiquitination of JAK2 enhances JAK2 phosphorylation and signal transduction
Abstract: JAK2 activation is crucial for cytokine receptor signal transduction and leukemogenesis. However, the underlying processes that lead to full activation of JAK2 are unclear....
Gfi1 Protein Turnover Is Regulated by the Ubiquitin Ligase Triad1.
Abstract The transcriptional repressor Growth factor independence-1 (Gfi1) plays an essential role during various stages of hematopoiesis. It is crucial for the self...
Conventional and unconventional ubiquitination in plant immunity
Summary Ubiquitination is one of the most abundant types of protein post‐translational modification (PTM) in plant cells. The importance of ubiquitination in the ...
Stimulating Deubiquitination
Ubiquitination is associated with targeting of proteins for degradation (polyubiquitination) or regulation (monoubiquitination). Chen et al. stimulated syna...
Ubiquitination of renal ENaC subunits in vivo
Ubiquitination of the epithelial Na+ channel (ENaC) in epithelial cells may influence trafficking and hormonal regulation of the channels. We assessed ENaC ubiquitination (ub-ENaC) ...
Figs S1-S9
Fig. S1. Consensus phylogram (50 % majority rule) resulting from a Bayesian analysis of the ITS sequence alignment of sequences generated in this study and reference sequences from...
