Evaluation of Metagenome Binning: Advances and Challenges
Abstract
Background
Several recent deep learning methods for metagenome binning claim improvements in the recovery of high-quality metagenome-assembled genomes. These methods differ in how they learn contig embeddings and how they cluster them. Rapid advances in binning require rigorous benchmarking to evaluate the effectiveness of new methods. We benchmarked newly developed state-of-the-art deep learning binners, including our own, McDevol, on CAMI2 datasets.
Results
The results show that COMEBin and GenomeFace give the best binning accuracy, although not always the best embedding accuracy. Interestingly, post-binning reassembly consistently improves the quality of low-coverage bins. We find that binning coassembled contigs with multi-sample coverage is effective for low-coverage datasets, while binning multi-sample contigs with multi-sample coverage (‘multi-sample’) is effective for high-coverage samples. In multi-sample binning, splitting the embedding space by sample before clustering performed better than the standard approach of splitting final clusters by sample.
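The two multi-sample strategies contrasted above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the `eps` threshold, and the toy single-linkage clustering are hypothetical and are not the method of McDevol or of any benchmarked binner. It only shows that the order of "split by sample" and "cluster" can change which contigs end up in the same bin.

```python
from collections import defaultdict
from math import dist


def threshold_cluster(points, eps):
    """Toy single-linkage clustering (illustrative stand-in, not a real binner):
    link any two points closer than eps and return one component label per point."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) < eps:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


def split_clusters_by_sample(embeddings, samples, eps):
    """Standard approach: cluster all contigs together, then split each cluster by sample."""
    labels = threshold_cluster(embeddings, eps)
    bins = defaultdict(list)
    for idx, (label, sample) in enumerate(zip(labels, samples)):
        bins[(sample, label)].append(idx)
    return sorted(bins.values())


def split_embeddings_by_sample(embeddings, samples, eps):
    """Alternative: partition the embeddings by sample first, then cluster each sample alone."""
    by_sample = defaultdict(list)
    for idx, sample in enumerate(samples):
        by_sample[sample].append(idx)
    bins = []
    for idxs in by_sample.values():
        labels = threshold_cluster([embeddings[i] for i in idxs], eps)
        groups = defaultdict(list)
        for i, label in zip(idxs, labels):
            groups[label].append(i)
        bins.extend(groups.values())
    return sorted(bins)


# Hypothetical data: two contigs from sample A, bridged by one contig from sample B.
embeddings = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0)]
samples = ["A", "A", "B"]
print(split_clusters_by_sample(embeddings, samples, eps=1.5))
print(split_embeddings_by_sample(embeddings, samples, eps=1.5))
```

In this toy input, the sample B contig links the two sample A contigs when all embeddings are clustered jointly, so they share a bin after the split. Clustering sample A on its own keeps them apart. Whether either outcome is preferable on real data is an empirical question that the sketch does not address.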
Conclusions
COMEBin and GenomeFace emerged as the top-performing tools overall, with MetaBAT2 and GenomeFace demonstrating superior speed. To facilitate future development, we provide workflows for standardized benchmarking of metagenome binners.
Related Results
Evaluation of metagenome binning: advances and challenges
Abstract
Several recent deep learning methods for metagenome binning claim improvements in the recovery of high-quality metagenome-assembled genomes. These method...
GraphK-LR: Enhancing Long-read Metagenomic Binning with Read-overlap Graphs Across Microbial Kingdoms
Abstract
Background: Metagenomics, the study of genetic material from environmental samples, relies on binning - the process of grouping DNA sequences from the same organis...
Effect of data binning and frame averaging for micro-CT image acquisition on the morphometric outcome of bone repair assessment
Abstract
Despite the current advances in micro-CT analysis, the influence of some image acquisition parameters on the morphometric assessment outcome has not been fully elucidated....
CoCoBin: Graph-Based Metagenomic Binning via Composition–Coverage Separation
Abstract
Motivation
Metagenomic binning is a critical step in metagenomic analysis, aiming to cluster contigs from the same genome into c...
Pixel Binning in Digital Radiography Imaging
In digital radiography imaging, pixel binning is an effective way to reduce the amount of image data for transmission or storage, and is particularly effective for application to d...
YAMB: metagenome binning using nonlinear dimensionality reduction and density-based clustering
Abstract
Summary
YAMB is a novel metagenome binning tool, which uses tetranucleotide composition and average contig coverage and performs t-SNE dimensionality reduction and sequenti...
Critical assessment of pan-genomics of metagenome-assembled genomes
Abstract
Background
Large-scale metagenome assembly and binning to generate metagenome-assembled genomes (MAGs) has become possible in the past five years. As a result, millions of M...
BusyBee Web: towards comprehensive and differential composition-based metagenomic binning
Abstract
Despite recent methodology and reference database improvements for taxonomic profiling tools, metagenomic assembly and genomic binning remain important pill...

