
FLAS: fast and high-throughput algorithm for PacBio long-read self-correction

Abstract

Motivation: The third generation PacBio long reads have greatly facilitated sequencing projects with very large read lengths, but they contain about 15% sequencing errors and need error correction. For the projects with long reads only, it is challenging to make correction with fast speed, and also challenging to correct a sufficient amount of read bases, i.e. to achieve high-throughput self-correction. MECAT is currently among the fastest self-correction algorithms, but its throughput is relatively small (Xiao et al., 2017).

Results: Here, we introduce FLAS, a wrapper algorithm of MECAT, to achieve high-throughput long-read self-correction while keeping MECAT’s fast speed. FLAS finds additional alignments from MECAT prealigned long reads to improve the correction throughput, and removes misalignments for accuracy. In addition, FLAS also uses the corrected long-read regions to correct the uncorrected ones to further improve the throughput. In our performance tests on Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana and human long reads, FLAS can achieve 22.0–50.6% larger throughput than MECAT. FLAS is 2–13× faster compared to the self-correction algorithms other than MECAT, and its throughput is also 9.8–281.8% larger. The FLAS corrected long reads can be assembled into contigs of 13.1–29.8% larger N50 sizes than MECAT.

Availability and implementation: The FLAS software can be downloaded for free from this site: https://github.com/baoe/flas.

Supplementary information: Supplementary data are available at Bioinformatics online.
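The abstract reports assembly quality as a percentage increase in contig N50. For readers unfamiliar with the metric, the following is a minimal sketch of how N50 and a relative improvement are commonly computed. It is not taken from the FLAS or MECAT code; the function names `n50` and `percent_larger` and the contig lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def percent_larger(new, baseline):
    """Relative increase of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical contig lengths (illustrative only, not data from the paper).
baseline_contigs = [100, 80, 60, 40, 20]
improved_contigs = [150, 90, 40, 20]

baseline_n50 = n50(baseline_contigs)
improved_n50 = n50(improved_contigs)
print(baseline_n50, improved_n50, percent_larger(improved_n50, baseline_n50))
```

Both toy assemblies contain the same total number of bases, so the difference in N50 reflects only how contiguous the assembly is, which is the sense in which the abstract compares assemblies built from FLAS-corrected and MECAT-corrected reads.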
