
DeEPsnap: human essential gene prediction by integrating multi-omics data

Abstract: Essential genes are necessary for the survival or reproduction of a living organism. Predicting and analyzing gene essentiality can advance our understanding of basic life processes and human diseases, and can support the development of new drugs. Wet-lab methods for identifying essential genes are often costly, time-consuming, and laborious. As a complement, computational methods have been proposed that predict essential genes by integrating multiple biological data sources. Most of these methods have been evaluated on model organisms; prediction methods for human essential genes remain limited, and the relationship between human gene essentiality and different kinds of biological information still needs to be explored. It is also worthwhile to explore deep learning techniques that can overcome the limitations of traditional machine learning methods and improve prediction accuracy. We propose DeEPsnap, a snapshot ensemble deep neural network method for predicting human essential genes. DeEPsnap integrates sequence features derived from DNA and protein sequence data with features extracted or learned from multiple types of functional data, such as gene ontology, protein complexes, protein domains, and the protein-protein interaction network. More than 200 features are extracted or learned from these data and combined to train a series of cost-sensitive deep neural networks using multiple deep learning techniques. The snapshot mechanism makes it possible to obtain multiple models without additional training effort or cost. In 10-fold cross-validation, DeEPsnap predicts human gene essentiality with an average AUROC (area under the receiver operating characteristic curve) of 96.1%, an average AUPRC (area under the precision-recall curve) of 93.82%, an average accuracy of 92.21%, and an average F1 measure of about 80.62%.
In the experimental comparison, DeEPsnap also outperforms several popular traditional machine learning models and deep learning models. These results demonstrate that DeEPsnap is effective for predicting human essential genes.
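
The abstract does not say how the snapshot mechanism or the cost-sensitive training is implemented. As a rough illustration only, the sketch below assumes the cyclic cosine-annealed learning rate that is commonly used for snapshot ensembles, inverse-frequency class weights as one possible cost-sensitive scheme, and plain averaging of the snapshot predictions. All function names and parameter values are hypothetical and are not taken from DeEPsnap.

```python
import math


def snapshot_lr(step, total_steps, n_cycles, lr_max):
    """Cyclic cosine-annealed learning rate (assumed schedule).

    The rate starts at lr_max at the beginning of every cycle and decays
    towards zero by the end of the cycle, then restarts.
    """
    cycle_len = math.ceil(total_steps / n_cycles)
    pos = step % cycle_len  # 0-based position inside the current cycle
    return 0.5 * lr_max * (math.cos(math.pi * pos / cycle_len) + 1.0)


def is_snapshot_step(step, total_steps, n_cycles):
    """True on the last step of each cycle, where a model snapshot is saved."""
    cycle_len = math.ceil(total_steps / n_cycles)
    return (step + 1) % cycle_len == 0 or step == total_steps - 1


def class_weights(labels):
    """Inverse-frequency weights for binary labels (assumed cost scheme).

    The rarer class (here: essential genes, label 1) gets the larger weight.
    Both classes must be present in `labels`.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return {0: n / (2 * n_neg), 1: n / (2 * n_pos)}


def ensemble_average(snapshot_probs):
    """Average per-gene probabilities over all snapshots.

    `snapshot_probs` holds one list of probabilities per snapshot,
    all in the same gene order.
    """
    n_models = len(snapshot_probs)
    return [sum(col) / n_models for col in zip(*snapshot_probs)]


if __name__ == "__main__":
    # Toy setting: one training run of 100 steps split into 5 cycles.
    total_steps, n_cycles, lr_max = 100, 5, 0.1
    for step in range(total_steps):
        lr = snapshot_lr(step, total_steps, n_cycles, lr_max)
        # A real training step with learning rate `lr` would go here.
        if is_snapshot_step(step, total_steps, n_cycles):
            print(f"step {step}: save snapshot (lr = {lr:.6f})")

    # Toy labels: 1 = essential, 0 = non-essential.
    print(class_weights([1, 0, 0, 0]))
    # Two snapshots scoring the same two genes.
    print(ensemble_average([[0.2, 0.8], [0.4, 0.6]]))
```

Because each snapshot is taken at the end of a learning-rate cycle within a single training run, the ensemble members come from one optimization trajectory, which is the sense in which several models are obtained without extra training cost.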
Springer Science and Business Media LLC

Related Results

Benchmarking multi-omics integrative clustering methods for subtype identification in colorectal cancer
Abstract. Background and objectives: Colorectal cancer (CRC) represents a heterogeneous malignancy that has concerned global burden of incidence and mortality. The tradition...
Muon: multimodal omics analysis framework
Abstract: Advances in multi-omics technologies have led to an explosion of multimodal datasets to address questions ranging from basic biology to translation. While these rich data p...
Multi-omics Data Integration by Generative Adversarial Network
Accurate disease phenotype prediction plays an important role in the treatment of heterogeneous diseases like cancer in the era of precision medicine. With the advent of high throu...
Expression and polymorphism of genes in gallstones
ABSTRACT Through the method of clinical case control study, to explore the expression and genetic polymorphism of KLF14 gene (rs4731702 and rs972283) and SR-B1 gene (rs...
Exploring the classification of cancer cell lines from multiple omic views
Background: Cancer classification is of great importance to understanding its pathogenesis, making diagnosis and developing treatment. The accumulation of extensive o...
A benchmark study of deep learning-based multi-omics data fusion methods for cancer
Abstract. Background: A fused method using a combination of multi-omics data enables a comprehensive study of complex biological processes and highlig...
