TranscriptomeReconstructoR: data-driven annotation of complex transcriptomes

Abstract

Background: The quality of gene annotation determines the interpretation of results obtained in transcriptomic studies. The growing number of genome sequence information calls for experimental and computational pipelines for de novo transcriptome annotation. Ideally, gene and transcript models should be called from a limited set of key experimental data.

Results: We developed TranscriptomeReconstructoR, an R package which implements a pipeline for automated transcriptome annotation. It relies on integrating features from independent and complementary datasets: i) full-length RNA-seq for detection of splicing patterns and ii) high-throughput 5’ and 3’ tag sequencing data for accurate definition of gene borders. The pipeline can also take a nascent RNA-seq dataset to supplement the called gene model with transient transcripts. We reconstructed de novo the transcriptional landscape of wild type Arabidopsis thaliana seedlings as a proof-of-principle. A comparison to the existing transcriptome annotations revealed that our gene model is more accurate and comprehensive than the two most commonly used community gene models, TAIR10 and Araport11. In particular, we identify thousands of transient transcripts missing from the existing annotations. Our new annotation promises to improve the quality of A. thaliana genome research.

Conclusions: Our proof-of-concept data suggest a cost-efficient strategy for rapid and accurate annotation of complex eukaryotic transcriptomes. We combine the choice of library preparation methods and sequencing platforms with the dedicated computational pipeline implemented in the TranscriptomeReconstructoR package. The pipeline only requires prior knowledge on the reference genomic DNA sequence, but not the transcriptome. The package seamlessly integrates with Bioconductor packages for downstream analysis.

Related Results

TrancriptomeReconstructoR, A Data-Driven Annotation of Complex Transcriptomes
Abstract Background: The quality of gene annotation determines the interpretation of results obtained in transcriptomic studies. The growing number of genome sequence infor...
QALB: Qatar Arabic language bank
Automatic text correction has been attracting research attention for English and some other western languages. Applications for automatic text correction vary from improving langua...
Mining sequence annotation databanks for association patterns
Abstract Motivation: Millions of protein sequences currently being deposited to sequence databanks will never be annotated manually. Similarity-based annotation gene...
Environmental transcriptomes of invasive dreissena, a model species in ecotoxicology and invasion biology
Abstract Dreissenids are established model species for ecological and ecotoxicological studies, since they are sessile and filter feeder organisms and reflect in situ freshwater qua...
Development and Evaluation of Gold Standard Dataset for Sentiment Analysis of Tweets
Pre-labeled data is typically required for supervised machine learning. A limited number of object classes in the majority of open access and pre-annotated datasets make them unsui...
Applying negative rule mining to improve genome annotation
Abstract Background: Unsupervised annotation of proteins by software pipelines suffers from very high error rates. Spurious functional assignments...
Zero-shot reconstruction of mutant spatial transcriptomes
Mutant analysis is the core of biological/pathological research, and measuring spatial gene expression can facilitate the understanding of the disorganised tissue phenotype ...
Automated annotation in UniProt
UniProt is a high quality, comprehensive protein resource in which the core activity is the expert review and annotation of proteins where the function has been experimentally inve...