How Error Correction Affects PCR Deduplication: A Survey Based on UMI Datasets of Short Reads
Abstract: Next-Generation Sequencing (NGS) data is widely utilised for various downstream applications in bioinformatics, and numerous techniques have been developed for PCR-deduplication and error-correction to eliminate bias and errors introduced during sequencing. This study provides, for the first time, a joint overview of recent advances in PCR-deduplication and error-correction on short reads. In particular, we utilise UMI-based PCR-deduplication strategies and sequencing data to assess the performance of the solely-computational PCR-deduplication approaches and to investigate how error correction affects the performance of PCR-deduplication. Our survey and comparative analysis reveal that the deduplicated reads generated by the solely-computational PCR-deduplication and error-correction methods differ substantially from the sets of reads obtained by the UMI-based deduplication methods. The existing solely-computational PCR-deduplication and error-correction tools can eliminate some errors but still leave hundreds of thousands of erroneous reads uncorrected. All the error-correction approaches introduce thousands or more new sequences after correction, and these new sequences bring no benefit to the PCR-deduplication process. Based on these findings, we offer practical suggestions to enhance the existing computational approaches for improving the quality of short-read sequencing data.
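The contrast between UMI-based and solely-computational deduplication can be illustrated with a toy sketch. It is not the method of any tool evaluated in the survey: the function names, the `(umi, sequence)` read representation, the example reads, and the one-mismatch threshold are all assumptions made for illustration. Real UMI-based pipelines typically also use mapping coordinates and tolerate errors in the UMI itself.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def dedup_by_sequence(reads):
    """Sequence-only deduplication: keep the first read of each distinct sequence."""
    seen = set()
    kept = []
    for umi, seq in reads:
        if seq not in seen:
            seen.add(seq)
            kept.append((umi, seq))
    return kept


def dedup_by_umi(reads, max_mismatches=1):
    """UMI-aware deduplication (hypothetical rule): within one UMI, a read that is
    within max_mismatches of an already kept read is treated as a PCR duplicate."""
    representatives = defaultdict(list)  # UMI -> sequences kept so far
    kept = []
    for umi, seq in reads:
        reps = representatives[umi]
        if any(len(r) == len(seq) and hamming(r, seq) <= max_mismatches for r in reps):
            continue  # duplicate of an earlier read from the same molecule
        reps.append(seq)
        kept.append((umi, seq))
    return kept


# Made-up example reads as (UMI, sequence) pairs.
reads = [
    ("AAAC", "ACGTACGT"),  # molecule 1
    ("AAAC", "ACGTACGA"),  # PCR duplicate of molecule 1 with one sequencing error
    ("GGTT", "ACGTACGT"),  # molecule 2: same sequence, different UMI
    ("CCAT", "TTGACCAA"),  # molecule 3
]

by_sequence = dedup_by_sequence(reads)
by_umi = dedup_by_umi(reads)

print("sequence-only:", by_sequence)
print("UMI-based:    ", by_umi)
```

In this toy input both strategies keep three reads, but not the same three. The sequence-only pass keeps the erroneous duplicate and merges two distinct molecules, while the UMI-aware pass does the opposite. That is one simple way the two kinds of output can diverge.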

