Critical assessment of pan-genomics of metagenome-assembled genomes
Abstract

Background: Large-scale metagenome assembly and binning to generate metagenome-assembled genomes (MAGs) has become possible in the past five years. As a result, millions of MAGs have been produced and are increasingly included in pan-genomics workflows. However, pan-genome analyses of MAGs may suffer from the known issues with MAGs: fragmentation, incompleteness, and contamination, which arise from mis-assembly and mis-binning. Here, we conducted a critical assessment of including MAGs in pan-genome analysis by comparing pan-genome analysis results of complete bacterial genomes and simulated MAGs.

Results: We found that incompleteness led to more significant core gene loss than fragmentation. Contamination had little effect on core genome size but had a major influence on accessory genomes. The core gene loss remained when using different pan-genome analysis tools and when using a mixture of MAGs and complete genomes. Importantly, the core gene loss was partially alleviated by lowering the core gene threshold and by using gene prediction algorithms that consider fragmented genes, but to a lesser degree when incompleteness was higher than 5%. The core gene loss also led to incorrect pan-genome functional predictions and inaccurate phylogenetic trees.

Conclusions: We conclude that lowering the core gene threshold and predicting genes in metagenome mode (as Anvi’o does with Prodigal) are necessary in pan-genome analysis of MAGs to alleviate the accuracy loss. Better quality control of MAGs and the development of new pan-genome analysis tools specifically designed for MAGs are needed in future studies.
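The effect of the core gene threshold described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' simulation procedure: it assumes that every genome truly carries every core gene and that each gene is lost independently with probability 1 - completeness, which is a simplification. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_incomplete_genomes(n_genomes, n_core_genes, completeness, rng):
    """Return one set of retained core-gene ids per simulated genome.

    Assumption (toy model): each gene is kept independently with
    probability `completeness`; real MAGs lose genes in contig-sized blocks.
    """
    return [
        {gene for gene in range(n_core_genes) if rng.random() < completeness}
        for _ in range(n_genomes)
    ]


def core_genome_size(genomes, n_genes, threshold):
    """Count genes present in at least `threshold` (a fraction) of genomes."""
    # Small epsilon guards against floating-point error in the product.
    min_genomes = math.ceil(threshold * len(genomes) - 1e-9)
    counts = [0] * n_genes
    for genome in genomes:
        for gene in genome:
            counts[gene] += 1
    return sum(1 for count in counts if count >= min_genomes)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    genomes = simulate_incomplete_genomes(50, 1000, 0.95, rng)
    for threshold in (1.0, 0.95, 0.90):
        print(threshold, core_genome_size(genomes, 1000, threshold))
```

Under these assumptions, a gene that is truly present in all 50 genomes survives in every one of them with probability 0.95^50, or roughly 8%, so a strict 100% threshold discards most of the true core, while a 90% threshold recovers most of it. Because the toy model ignores contamination and fragmentation, it says nothing about their effects.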

