
Long-read error correction: a survey and qualitative comparison

Abstract
Third-generation sequencing technologies from Pacific Biosciences and Oxford Nanopore Technologies became available in 2011 and 2014, respectively. In contrast with second-generation sequencing technologies such as Illumina, these technologies can sequence long reads of tens to hundreds of kbp. These so-called long reads are particularly promising and are expected to help solve problems such as contig assembly, haplotype assembly, and scaffolding. However, they are also much more error-prone than second-generation reads, with error rates reaching 10 to 30% depending on the sequencing technology and the version of the chemistry. Moreover, these errors are mainly insertions and deletions, whereas most errors in Illumina reads are substitutions. As a result, long reads require efficient error correction, and a plethora of error correction tools targeting them directly have been developed over the past ten years. These methods adopt either a hybrid approach, which uses complementary short reads to perform correction, or a self-correction approach, which uses only the information contained in the long-read sequences. Both approaches rely on strategies such as multiple sequence alignment, de Bruijn graphs, or hidden Markov models, and some tools combine several strategies. In this paper, we present a complete survey of long-read error correction, reviewing the methodologies and tools available to date for both hybrid correction and self-correction. Moreover, long-read characteristics such as sequencing depth, length, error rate, and even sequencing technology have a large impact on how well a given tool or strategy performs, and can drastically reduce correction quality. We therefore also present an in-depth benchmark of available long-read error correction tools on a wide variety of datasets, composed of both simulated and real data, with various error rates, coverages, and read lengths, ranging from small bacterial to large mammalian genomes.
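To make the hybrid idea concrete, here is a minimal Python sketch of one building block that k-mer and de Bruijn graph based correctors commonly rely on: collecting "solid" k-mers from accurate short reads and flagging the stretches of a long read that no solid k-mer covers. It is an illustration under simplifying assumptions, not the method of any tool in the survey. The function names (`kmers`, `solid_kmers`, `weak_regions`), the value of `k`, the `min_count` threshold, and the toy sequences are all hypothetical, and a real corrector would go on to repair the flagged regions (for example by searching a path in the graph) rather than only locating them.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer (substring of length k) of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def solid_kmers(short_reads, k, min_count=2):
    """Return the k-mers seen at least min_count times in the short reads.

    The threshold is an assumed heuristic: frequent k-mers are treated
    as error-free, rare ones as likely sequencing errors.
    """
    counts = Counter(km for read in short_reads for km in kmers(read, k))
    return {km for km, count in counts.items() if count >= min_count}


def weak_regions(long_read, solid, k):
    """Return half-open (start, end) intervals of long_read positions
    that are not covered by any solid k-mer."""
    covered = [False] * len(long_read)
    for i, km in enumerate(kmers(long_read, k)):
        if km in solid:
            for j in range(i, i + k):
                covered[j] = True

    regions = []
    start = None
    for pos, ok in enumerate(covered):
        if not ok and start is None:
            start = pos                      # a weak region opens here
        elif ok and start is not None:
            regions.append((start, pos))     # the weak region closes
            start = None
    if start is not None:                    # region runs to the read end
        regions.append((start, len(long_read)))
    return regions


# Toy data (hypothetical): the reference is ACGTTGCAAGCT, each short read
# is seen twice, and the long read carries one inserted 'A' at index 6.
short_reads = ["ACGTTGCA", "TTGCAAGCT"] * 2
solid = solid_kmers(short_reads, k=4)
noisy_long_read = "ACGTTGACAAGCT"
flagged = weak_regions(noisy_long_read, solid, k=4)  # [(6, 7)]
```

On this toy input the only flagged interval is the inserted base, which matches the observation that long-read errors are mostly insertions and deletions. A self-correction variant could reuse the same functions by counting k-mers in the long reads themselves, although at error rates of 10 to 30% far fewer k-mers would pass the threshold.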
