GraphK-LR: Enhancing Long-read Metagenomic Binning with Read-overlap Graphs Across Microbial Kingdoms
Abstract
Background: Metagenomics, the study of genetic material from environmental samples, relies on binning, the process of grouping DNA sequences from the same organism, to disentangle complex species mixtures. Recently, interest has grown in using long reads from third-generation sequencing technologies to overcome the limitations of short reads. These long reads contain species-specific signals that allow them to be grouped directly into taxonomic bins prior to assembly. Previous studies have successfully used nucleotide composition and coverage for binning long reads. The advent of less error-prone sequencing technologies has paved the way for incorporating additional information to enhance binning accuracy. In this paper, we introduce GraphK-LR, a long-read binning refiner that uses connectivity information between reads and machine-learning-based graph techniques to refine potentially misclassified reads from an initial binning tool. Additionally, our tool uses marker-gene-based kingdom-level analysis to address the challenge of species from different microbial kingdoms being present in the same metagenomic sample, which makes such samples difficult to bin with existing tools. This approach is inspired by the multitude of short-read refiners and addresses the lack of refining tools for long reads.
Results: We evaluated the tool using publicly available mock community datasets sequenced with Oxford Nanopore R.10.x chemistry, initially binned using the existing tools OBLR and LRBinner. Upon refinement, we observed a marginal improvement of 2-3% in binning accuracy, which indicates that both of these tools are already highly effective at correctly binning reads. Another long-read binning tool, SemiBin2, discarded nearly 20% of reads during binning; after refinement, we observed a significant improvement of 20-30% in the evaluation criteria. These results demonstrate that GraphK-LR adds a layer of accuracy on top of the binning tools, particularly in cases with unclassified reads.
Conclusion: Although there is room for further enhancement, our tool represents an important first step in exploring how far the accuracy of long-read binning can be improved by combining existing methods with more sophisticated techniques. The underlying concept of GraphK-LR has the potential to advance long-read-based metagenomic analyses across a wide range of applications. The source code for GraphK-LR can be found at https://github.com/NethmiRanasinghe/GraphK-LR.
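The abstract does not specify which graph technique GraphK-LR applies, so the following is only a minimal sketch of the general idea of refining bins with read-overlap connectivity. It assumes a simple neighbour majority vote as a stand-in, not the authors' machine-learning method. The function name `refine_bins`, the input format (pairs of overlapping read IDs and a dictionary of initial bin labels), and the `rounds` parameter are all hypothetical.

```python
from collections import Counter


def refine_bins(overlaps, initial_bins, rounds=5):
    """Relabel reads by a strict-majority vote of their overlap-graph neighbours.

    Illustrative stand-in only; this is NOT the GraphK-LR algorithm.
    overlaps: iterable of (read_a, read_b) pairs that overlap (assumed format).
    initial_bins: dict mapping read ID -> bin label, or None if unbinned.
    """
    # Build an undirected adjacency list from the overlap pairs.
    neighbours = {read: set() for read in initial_bins}
    for a, b in overlaps:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Reads seen only in the overlaps start out unbinned (None).
    bins = {read: initial_bins.get(read) for read in neighbours}

    for _ in range(rounds):
        updated = dict(bins)
        for read, adjacent in neighbours.items():
            # Count the labels of neighbours that currently have a bin.
            votes = Counter(bins[n] for n in adjacent if bins[n] is not None)
            if not votes:
                continue
            top, top_count = votes.most_common(1)[0]
            # Relabel only when one bin holds a strict majority of the votes;
            # ties leave the current label untouched.
            if 2 * top_count > sum(votes.values()):
                updated[read] = top
        if updated == bins:
            break  # No label changed in this round, so stop early.
        bins = updated
    return bins
```

In this sketch, a read whose initial label disagrees with most of its overlapping reads is moved to their bin, and an unbinned read adopts the bin of its neighbours, which mirrors the two cases the abstract mentions (misclassified and unclassified reads). The strict-majority rule is a conservative design choice for the example, so that a single mislabelled neighbour cannot flip a correctly binned read.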
Springer Science and Business Media LLC
Related Results
CoCoBin: Graph-Based Metagenomic Binning via Composition–Coverage Separation
Abstract (Motivation): Metagenomic binning is a critical step in metagenomic analysis, aiming to cluster contigs from the same genome into c...
CAIM: Coverage-based Analysis for Identification of Microbiome
Abstract: Accurate taxonomic profiling of microbial taxa in a metagenomic sample is vital to gain insights into microbial ecology. Recent advancements in sequencing technologies have...
Effect of data binning and frame averaging for micro-CT image acquisition on the morphometric outcome of bone repair assessment
Abstract: Despite the current advances in micro-CT analysis, the influence of some image acquisition parameters on the morphometric assessment outcome has not been fully elucidated....
BusyBee Web: towards comprehensive and differential composition-based metagenomic binning
Abstract: Despite recent methodology and reference database improvements for taxonomic profiling tools, metagenomic assembly and genomic binning remain important pill...
Metagenomic Thermometer
Abstract: Various microorganisms exist in environments, each of which has an optimal growth temperature (OGT). The relationship between genomic information and OGT of each specie...
Pixel Binning in Digital Radiography Imaging
In digital radiography imaging, pixel binning is an effective way to reduce the amount of image data for transmission or storage, and is particularly effective for application to d...
Metagenomic Thermometer
Abstract: Various microorganisms exist in environments, and each of them has its optimal growth temperature (OGT). The relationship between genomic information and OG...
LMAS: evaluating metagenomic short de novo assembly methods through defined communities
Abstract (Background): The de novo assembly of raw sequence data is key in metagenomic analysis. It allows recovering draft genomes...

