LongTron: Automated Analysis of Long Read Spliced Alignment Accuracy

Abstract

Motivation: Long read sequencing has increased the accuracy and completeness of assemblies of various organisms’ genomes in recent months. Similarly, spliced alignments of long read RNA sequencing hold the promise of delivering much longer transcripts of existing and novel isoforms in known genes without the need for error-prone transcript assemblies from short reads. However, low coverage and high error rates potentially hamper the widespread adoption of long read spliced alignments in annotation updates and isoform-level expression quantifications.

Results: Addressing these issues, we first develop a simulation of error modes for both Oxford Nanopore and PacBio CCS spliced alignments. Based on this, we train a Random Forest classifier to assign new long read alignments to one of two error categories, a novel category, or label them as non-error. We use this classifier to label reads from the spliced alignments of the popular aligner minimap2, run on three long read sequencing datasets, including NA12878 from Oxford Nanopore and PacBio CCS, as well as a PacBio SKBR3 cancer cell line. Finally, we compare the intron chains of the three long read alignments against individual splice sites, short read assemblies, and the output from the FLAIR pipeline on the same samples. Our results demonstrate a substantial lack of precision in determining exact splice sites for long reads during alignment on both platforms, while showing some benefit from postprocessing. This work motivates the need for both better aligners and additional post-alignment processing to adjust incorrectly called putative splice sites and to clarify the support for novel transcripts.

Availability and implementation: Source code for the random forest, implemented in Python, is available at https://github.com/schatzlab/LongTron under the MIT license. The modified version of GffCompare used to construct Table 3 and related is here: https://github.com/ChristopherWilks/gffcompare/releases/tag/0.11.2LT

Supplementary Information: Supplementary notes and figures are available online.
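The comparison of intron chains against individual splice sites described above can be illustrated with a minimal sketch. This is not the LongTron or GffCompare implementation: the function names `intron_chain` and `label_introns`, the 10-base `wobble` tolerance, and the three labels are all assumptions made for illustration. The sketch derives the intron chain of one spliced alignment from its CIGAR string (introns are the `N` operations) and labels each intron by how closely it matches a set of annotated introns.

```python
import re
from typing import Iterable, List, Set, Tuple

Intron = Tuple[int, int]  # (start, end) on the reference, 0-based, end exclusive

_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def intron_chain(pos: int, cigar: str) -> List[Intron]:
    """Return the introns of an alignment that starts at 0-based `pos`.

    Only CIGAR operations M, D, N, = and X consume the reference;
    N marks a skipped region, which a spliced aligner uses for introns.
    """
    introns: List[Intron] = []
    ref = pos
    for length, op in _CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "N":
            introns.append((ref, ref + n))
        if op in "MDN=X":
            ref += n
    return introns


def label_introns(chain: Iterable[Intron],
                  annotated: Set[Intron],
                  wobble: int = 10) -> List[str]:
    """Label each intron 'exact', 'near', or 'unsupported'.

    'near' means both splice sites lie within `wobble` bases of a single
    annotated intron (hypothetical tolerance, chosen for illustration).
    """
    labels: List[str] = []
    for start, end in chain:
        if (start, end) in annotated:
            labels.append("exact")
        elif any(abs(start - a) <= wobble and abs(end - b) <= wobble
                 for a, b in annotated):
            labels.append("near")
        else:
            labels.append("unsupported")
    return labels


if __name__ == "__main__":
    # Made-up alignment and annotation, for demonstration only.
    chain = intron_chain(100, "5S50M200N30M2D20M1000N40M")
    annotation = {(150, 350), (400, 1402)}
    for intron, label in zip(chain, label_introns(chain, annotation)):
        print(intron, label)
```

An intron labeled 'near' in this sketch corresponds to the kind of imprecise splice-site call that post-alignment processing would aim to adjust, while an 'unsupported' intron could be either an alignment error or a genuinely novel junction; telling those apart needs more evidence than this sketch uses.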

