Transcript Assembly and Annotations: Bias and Adjustment
Abstract

Motivation: Transcript annotations play a critical role in gene expression analysis, as they serve as the reference for quantifying isoform-level expression. The two main sources of annotations are RefSeq and Ensembl/GENCODE, but discrepancies between their methodologies and information resources can lead to significant differences. It has been demonstrated that the choice of annotation can have a significant impact on gene expression analysis. Furthermore, transcript assembly is closely linked to annotations: assembling large-scale available RNA-seq data is an effective data-driven way to construct annotations, and annotations often serve as benchmarks for evaluating the accuracy of assembly methods. However, the influence of different annotations on transcript assembly is not yet fully understood.

Results: We investigate the impact of annotations on transcript assembly. We observe that conflicting conclusions can arise when assemblers are evaluated with different annotations. To understand this striking phenomenon, we compare the structural similarity of annotations at various levels and find that the primary structural difference across annotations occurs at the intron-chain level. Next, we examine the biotypes of annotated and assembled transcripts and uncover a significant bias towards annotating and assembling transcripts with intron retentions, which explains the contradictory conclusions above. We develop a standalone tool, available at https://github.com/Shao-Group/irtool, that can be combined with an assembler to generate an assembly without intron retentions. We evaluate the performance of such a pipeline and offer guidance on selecting appropriate assembly tools for different application scenarios.
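To make the terms "intron-chain level" and "intron retention" concrete, the following minimal Python sketch may help. It is an illustration only, not the method implemented in irtool or the comparison procedure used in the paper: the function names, the coordinate convention (1-based inclusive exon coordinates, as in GTF), and the simplified retention test are all assumptions, and chromosome and strand are ignored.

```python
from typing import List, Set, Tuple

Exon = Tuple[int, int]           # (start, end), 1-based inclusive (assumed, GTF-like)
Chain = Tuple[Tuple[int, int], ...]


def intron_chain(exons: List[Exon]) -> Chain:
    """Return the ordered introns lying between consecutive exons."""
    ordered = sorted(exons)
    return tuple(
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])
    )


def chain_jaccard(annotation_a: List[List[Exon]], annotation_b: List[List[Exon]]) -> float:
    """Jaccard similarity of two annotations at the intron-chain level.

    Each annotation is a list of transcripts; single-exon transcripts
    (empty chains) are skipped. This is a hypothetical similarity measure.
    """
    chains_a: Set[Chain] = {intron_chain(t) for t in annotation_a} - {()}
    chains_b: Set[Chain] = {intron_chain(t) for t in annotation_b} - {()}
    union = chains_a | chains_b
    return len(chains_a & chains_b) / len(union) if union else 1.0


def retains_intron(exons: List[Exon], reference_exons: List[Exon]) -> bool:
    """True if some exon fully spans an intron of the reference transcript.

    A simplified stand-in for an intron-retention test, assumed here
    for illustration only.
    """
    return any(
        start <= intron_start and intron_end <= end
        for start, end in exons
        for intron_start, intron_end in intron_chain(reference_exons)
    )


# Toy transcripts: `retained` merges the first two exons of `spliced`.
spliced = [(100, 200), (300, 400), (500, 600)]
retained = [(100, 400), (500, 600)]
```

With these toy transcripts, `retains_intron(retained, spliced)` is true because the exon (100, 400) covers the intron (201, 299), while the two transcripts have different intron chains and would therefore count as a mismatch in an intron-chain-level evaluation.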

