Non-Homology-Based Prediction of Gene Functions
Abstract
Advances in genome sequencing and annotation have made it easier to identify new gene sequences, but predicting the functions of these newly identified genes remains challenging. Genes descended from a common ancestral sequence are likely to share functions, so homology is widely used for gene function prediction. A consequence is that functional annotation errors also propagate from one species to another. Several approaches based on machine learning classification algorithms were evaluated for their ability to accurately predict gene function from non-homology gene features. Among the eight supervised classification algorithms evaluated, random forest-based prediction consistently provided the most accurate gene function prediction. Non-homology-based functional annotation provides strengths complementary to those of homology-based annotation: it shows higher average performance for Biological Process GO terms, the domain where homology-based functional annotation performs worst, and weaker performance for Molecular Function GO terms, the domain where the accuracy of homology-based functional annotation is highest. Non-homology-based functional annotation based on machine learning may ultimately prove useful both as a method to assign predicted functions to orphan genes that lack functionally characterized homologs, and as a way to identify and correct functional annotation errors that were propagated through homology-based functional annotation.
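As an illustration of the kind of classifier the abstract refers to, the following is a minimal random forest sketch in plain Python (bootstrap sampling, a random feature subset at each split, majority voting). It is not the authors' pipeline: the three features (an expression-like value, a length-like value, and pure noise), the synthetic data generator, and every function name are hypothetical, and the single binary label stands in for "gene is annotated with one given GO term".

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, rng, max_depth=4):
    """Grow one tree; leaves are integer labels, inner nodes are tuples."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))  # random feature subset per split
    split = best_split(rows, labels, rng.sample(range(n_features), k))
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], rng, max_depth - 1),
            build_tree([r for r, _ in right], [y for _, y in right], rng, max_depth - 1))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the training genes."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

def make_synthetic_genes(n, seed=1):
    """Hypothetical data: two weakly informative features plus one noise feature."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        has_term = rng.random() < 0.5
        expression = rng.gauss(2.0 if has_term else 0.0, 1.0)
        length = rng.gauss(1.0 if has_term else 0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        rows.append([expression, length, noise])
        labels.append(int(has_term))
    return rows, labels

rows, labels = make_synthetic_genes(300)
train_X, train_y = rows[:200], labels[:200]
test_X, test_y = rows[200:], labels[200:]
forest = train_forest(train_X, train_y)
accuracy = sum(predict_forest(forest, r) == y for r, y in zip(test_X, test_y)) / len(test_y)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

In a real setting, one such binary classifier would typically be trained per GO term and scored on held-out genes. The accuracy printed here reflects only the synthetic data and says nothing about the results reported in the paper.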

