Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A
Abstract
Motivation
Understanding the ever-increasing number of metagenomic sequences accumulating in our databases demands approaches that rapidly ‘explore’ the content of multiple and/or large metagenomic datasets with respect to specific domain targets, avoiding full domain annotation and full assembly.
Results
S3A is a fast and accurate domain-targeted assembler designed for rapid functional profiling. It is based on a novel construction and a fast traversal of the Overlap-Layout-Consensus graph, designed to reconstruct coding regions from domain-annotated metagenomic sequence reads. S3A relies on high-quality domain annotation to assemble metagenomic sequences efficiently, and on a new confidence measure for the fast evaluation of overlapping reads. Its implementation is highly generic and can be applied to any type of annotation. On simulated data, S3A achieves a level of accuracy similar to that of classical metagenomic assembly tools while allowing faster, sensitive profiling of domains of interest. When studying a few dozen functional domains (a typical scenario), S3A is up to an order of magnitude faster than general-purpose metagenomic assemblers, thus enabling the analysis of a larger number of datasets in the same amount of time. S3A opens new avenues for the fast exploration of the rapidly growing number of ever-larger metagenomic datasets.
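The general idea of restricting an Overlap-Layout-Consensus assembly to reads annotated with target domains can be illustrated with a toy sketch. This is not S3A's algorithm: exact suffix-prefix matching stands in for its confidence measure, the greedy layout stands in for its graph traversal, and the names `overlap_len`, `assemble`, `assemble_by_domain` and the domain labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def overlap_len(a: str, b: str, min_len: int) -> int:
    """Longest suffix of `a` equal to a prefix of `b`; 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), max(min_len, 1) - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def assemble(reads: List[str], min_len: int = 4) -> List[str]:
    """Toy greedy Overlap-Layout-Consensus assembly (illustrative only)."""
    # Overlap: record every directed suffix-prefix overlap between two reads.
    edges: Dict[Tuple[int, int], int] = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                k = overlap_len(a, b, min_len)
                if k:
                    edges[(i, j)] = k

    # Layout: keep the longest overlaps first, one successor and one
    # predecessor per read, skipping any edge that would close a cycle.
    succ: Dict[int, int] = {}
    pred: Dict[int, int] = {}
    for (i, j), _ in sorted(edges.items(), key=lambda e: -e[1]):
        if i in succ or j in pred:
            continue
        end = j
        while end in succ:
            end = succ[end]
        if end == i:
            continue
        succ[i], pred[j] = j, i

    # Consensus (simplified): merge each chain by appending the
    # non-overlapping tail of the next read.
    contigs: List[str] = []
    for start in range(len(reads)):
        if start in pred:
            continue
        contig, cur = reads[start], start
        while cur in succ:
            nxt = succ[cur]
            contig += reads[nxt][edges[(cur, nxt)]:]
            cur = nxt
        contigs.append(contig)
    return contigs


def assemble_by_domain(
    annotated_reads: Iterable[Tuple[str, str]],
    targets: Set[str],
    min_len: int = 4,
) -> Dict[str, List[str]]:
    """Assemble only the reads annotated with one of the target domains."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for seq, domain in annotated_reads:
        if domain in targets:
            groups[domain].append(seq)
    return {d: assemble(seqs, min_len) for d, seqs in groups.items()}
```

For example, with the made-up reads `("ATGGCGT", "domA")`, `("TACCTGA", "domA")`, `("GCGTACC", "domA")` and `("CCCCGGGG", "domB")`, calling `assemble_by_domain(reads, {"domA"})` ignores the `domB` read and merges the three `domA` reads into the single contig `ATGGCGTACCTGA`. Working only on the reads that carry a target annotation is what keeps the overlap step small in a sketch like this.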
Availability and implementation
S3A is available at http://www.lcqb.upmc.fr/S3A_ASSEMBLER/.
Supplementary information
Supplementary data are available at Bioinformatics online.
Oxford University Press (OUP)

