RNAsamba: neural network-based assessment of the protein-coding potential of RNA sequences
Abstract
The advent of high-throughput sequencing technologies has made it possible to obtain large volumes of genetic information quickly and inexpensively. Consequently, many efforts are devoted to unveiling the biological roles of genomic elements, and distinguishing protein-coding from long non-coding RNAs is one of the most important of these tasks. We describe RNAsamba, a tool that predicts the coding potential of RNA molecules from sequence information using a neural network-based model of both the whole sequence and the ORF, identifying patterns that distinguish coding from non-coding transcripts. We evaluated RNAsamba’s classification performance on transcripts from humans and several other model organisms and show that it consistently outperforms other state-of-the-art methods. Our results also show that RNAsamba can identify coding signals in partial-length ORFs and UTR sequences, indicating that its algorithm does not depend on complete transcript sequences. Furthermore, RNAsamba can predict small ORFs, which are traditionally identified with ribosome profiling experiments. We believe that RNAsamba will enable faster and more accurate biological findings from genomic data of species that are being sequenced for the first time. A user-friendly web interface, documentation with instructions for local installation and usage, and the source code of RNAsamba can be found at https://rnasamba.lge.ibi.unicamp.br/.
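To make the notion of "the ORF" within a transcript concrete, the following is a minimal sketch of one common convention: take the longest forward-strand reading frame that runs from an ATG to the first in-frame stop codon. This is an illustrative assumption, not RNAsamba's actual ORF-selection procedure, and the function name `longest_orf` is hypothetical.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, orf) for the longest ATG-to-stop ORF.

    Only the three forward-strand frames are scanned. Coordinates are
    0-based and end-exclusive; the stop codon is included in the ORF.
    Returns (0, 0, "") when no complete ORF is found.
    Hypothetical helper for illustration; not part of RNAsamba.
    """
    # Accept RNA or DNA input by normalizing to an uppercase DNA alphabet.
    seq = seq.upper().replace("U", "T")
    best = (0, 0, "")
    for frame in range(3):
        start = None
        # Walk the frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start > len(best[2]):
                    best = (start, end, seq[start:end])
                start = None  # resume searching after the stop codon
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGAAATGGTAACC"))
```

A transcript with no complete ORF yields an empty string, which is one reason a classifier may also need to consider the whole sequence, as the abstract describes.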
Oxford University Press (OUP)
Related Results
RNAsamba: coding potential assessment using ORF and whole transcript sequence information
Abstract
Motivation
The advent of high-throughput sequencing technologies made it possible to obtain large volumes of genetic information, quickly and inexpensively. Thus, many effor...
B-247 BLADE-R: streamlined RNA extraction for clinical diagnostics and high-throughput applications
Abstract
Background
Efficient nucleic acid extraction and purification are crucial for cellular and molecular biology research, ...
Endothelial Protein C Receptor
Introduction
The protein C anticoagulant pathway plays a critical role in the negative regulation of the blood clotting response. The pathway is triggered by thrombin, which allows ...
Molecular Drivers of RNA Phase Separation
Abstract
RNA molecules are essential in orchestrating the assembly of biomolecular condensates and membraneless compartments in cells. Many condensates form via the association of R...
The polyadenosine RNA binding protein Nab2 regulates alternative splicing and intron retention during Drosophila melanogaster brain development
Abstract
The regulation of cell-specific gene expression patterns during development requires the coordinated actions of hundreds of proteins, including transcription factors, proce...
Accurate in silico predictions of modified RNA interactions to a prototypical RNA-binding protein with λ-dynamics
RNA-binding proteins shape biology through their widespread functions in RNA biochemistry. Their function requires the recognition of specific RNA motifs for targeted binding. Thes...
Abstract 2323: Deciphering RNA degradation: Insights from a comparative analysis of paired fresh frozen/FFPE total RNA-seq
Abstract
Background: Fresh frozen (FF) and formalin-fixed paraffin-embedded (FFPE) samples are primary resources for archival tissues in cancer studies. Despite the ...
Abstract P1-05-23: Utilities and challenges of RNA-Seq based expression and variant calling in a clinical setting
Abstract
Introduction
Variant calling based on DNA samples has been the gold standard of clinical testing since the advent of Sanger sequencing. The u...

