
Critical assessment of pan-genomics of metagenome-assembled genomes

Abstract

Background: Large-scale metagenome assembly and binning to generate metagenome-assembled genomes (MAGs) has become possible in the past five years. As a result, millions of MAGs have been produced and are increasingly included in pan-genomics workflows. However, pan-genome analyses of MAGs may suffer from the known issues with MAGs: fragmentation, incompleteness, and contamination, due to mis-assembly and mis-binning. Here, we conducted a critical assessment of including MAGs in pan-genome analysis, by comparing pan-genome analysis results of complete bacterial genomes and simulated MAGs.

Results: We found that incompleteness led to more significant core gene loss than fragmentation. Contamination had little effect on core genome size but had a major influence on accessory genomes. The core gene loss remained when using different pan-genome analysis tools and when using a mixture of MAGs and complete genomes. Importantly, the core gene loss was partially alleviated by lowering the core gene threshold and using gene prediction algorithms that consider fragmented genes, but to a lesser degree when incompleteness was higher than 5%. The core gene loss also led to incorrect pan-genome functional predictions and inaccurate phylogenetic trees.

Conclusions: We conclude that lowering the core gene threshold and predicting genes in metagenome mode (as Anvi’o does with Prodigal) are necessary in pan-genome analysis of MAGs to alleviate the accuracy loss. Better quality control of MAGs and development of new pan-genome analysis tools specifically designed for MAGs are needed in future studies.
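The effect of the core gene threshold described in the Results can be illustrated with a toy example. The sketch below is not the authors' pipeline or any pan-genome tool's API; the function `core_genes`, the genome names, and the gene labels are all hypothetical, and the "simulated MAGs" are made by hand-deleting one gene from two genomes. It only shows why a strict 100% threshold loses core genes under incompleteness and why a lower threshold can partly recover them.

```python
def core_genes(presence, threshold=1.0):
    """Return the sorted genes found in at least `threshold` (a fraction)
    of the genomes. `presence` maps genome name -> set of gene labels."""
    n_genomes = len(presence)
    all_genes = set().union(*presence.values())
    core = []
    for gene in all_genes:
        n_with_gene = sum(gene in genes for genes in presence.values())
        if n_with_gene >= threshold * n_genomes:
            core.append(gene)
    return sorted(core)


# Hypothetical complete genomes: A, B and C are present in every genome.
complete = {
    "genome1": {"A", "B", "C", "D"},
    "genome2": {"A", "B", "C", "E"},
    "genome3": {"A", "B", "C"},
    "genome4": {"A", "B", "C", "D"},
}

# Hypothetical incomplete MAGs: genome2 lost gene B, genome4 lost gene C.
mags = dict(complete)
mags["genome2"] = {"A", "C", "E"}
mags["genome4"] = {"A", "B", "D"}

print(core_genes(complete, 1.0))   # strict core on complete genomes
print(core_genes(mags, 1.0))       # strict core shrinks on incomplete MAGs
print(core_genes(mags, 0.75))      # relaxed threshold recovers B and C
```

In this toy case the relaxed threshold restores the original core, but the recovery depends on how many genomes lose each gene; the abstract reports that the benefit diminishes when incompleteness exceeds 5%.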
Cold Spring Harbor Laboratory

Related Results

binny: an automated binning algorithm to recover high-quality genomes from complex metagenomic datasets
Abstract The reconstruction of genomes is a critical step in genome-resolved metagenomics and for multi-omic data integration from microbial comm...
Genomics and society: four scenarios for 2015
This paper develops four alternative scenarios depicting possible futures for genomics applications within a broader social context. The scenarios integrate forecasts for future ge...
Unlocking Nature's Code: The Power of Pan-Genomics in Biological Entities
Pan-genomics, a holistic approach to genomic analysis, has become a powerful tool with diverse applications in biomedical and environmental sciences. This offers a valuable unders...
How chromosomal rearrangements shape genomes : a computational and mathematical study
Comment les réarrangements chromosomiques façonnent les génomes : étude par modélisation et simulations. The origins of genome complexity, as well as the ...
Statistique des comparaisons de génomes complets bactériens
Comparative genomics is the study of the structural and functional relationships between genomes belonging to different strains or species. This discipline thus offers the ...
Pan-genome analysis of six Paracoccus type strain genomes reveal lifestyle traits
The genus Paracoccus, capable of inhabiting a variety of ecological niches, both marine and terrestrial, is globally distributed. In addition, Paracoccus is taxonomically,...
