VTAM: A robust pipeline for validating metabarcoding data using optimized parameters based on internal controls
Metabarcoding has become a powerful approach for studying biodiversity from environmental samples, but it remains prone to several pitfalls. Several papers have called for good practice in study design, data production and analysis to ensure repeatability and comparability between studies. Notably, the importance of mock community samples, negative controls, and replicates is frequently highlighted (Alberdi et al. 2018, O'Rourke et al. 2020). However, their use in bioinformatics pipelines is often limited to post hoc verification of expectations by the user. Indeed, one of the biggest challenges in metabarcoding analyses is managing the trade-off between false positive (FP) and false negative (FN) occurrences. We thus developed the VTAM (Validation and Taxonomic Assignation of Metabarcoding data) pipeline, which is the first tool to explicitly use negative control and mock samples to find optimal parameters that minimize false positive and false negative occurrences. In addition, VTAM addresses all known technical error types, including tag-jumps, takes repeatability among replicates into account, and can integrate more than one overlapping marker to further minimize false negative occurrences.
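The idea of tuning a filter against internal controls can be illustrated with a minimal sketch. This is not VTAM's code or interface: the function names, the toy read counts, and the selection criterion (lowest FP + FN, ties broken by the lowest cutoff) are all assumptions made for illustration. An occurrence is kept when its read count reaches the cutoff; anything kept in a negative control, or kept in a mock sample without being expected there, counts as a false positive, and an expected mock variant that is lost counts as a false negative.

```python
from typing import Dict, Iterable, Set, Tuple

# sample name -> variant name -> read count (toy data, hypothetical)
Counts = Dict[str, Dict[str, int]]


def count_errors(counts: Counts,
                 expected: Dict[str, Set[str]],
                 negatives: Set[str],
                 cutoff: int) -> Tuple[int, int]:
    """Return (false positives, false negatives) for a read-count cutoff."""
    fp = fn = 0
    for sample, variants in counts.items():
        # Keep only occurrences with at least `cutoff` reads.
        kept = {v for v, n in variants.items() if n >= cutoff}
        if sample in negatives:
            fp += len(kept)                      # nothing should survive here
        elif sample in expected:
            fp += len(kept - expected[sample])   # unexpected variants in mock
            fn += len(expected[sample] - kept)   # expected variants lost
    return fp, fn


def best_cutoff(counts: Counts,
                expected: Dict[str, Set[str]],
                negatives: Set[str],
                cutoffs: Iterable[int]) -> int:
    """Pick the cutoff with the fewest FP + FN (assumed criterion)."""
    return min(cutoffs,
               key=lambda c: (sum(count_errors(counts, expected, negatives, c)), c))


# Toy example: one mock sample with two expected variants, one negative control.
counts = {
    "mock1": {"A": 500, "B": 40, "X": 8},
    "neg1": {"A": 3, "Y": 12},
}
expected = {"mock1": {"A", "B"}}
negatives = {"neg1"}

chosen = best_cutoff(counts, expected, negatives, [1, 10, 20, 50])
```

In this toy data a cutoff that is too low keeps contaminants in both controls, while one that is too high drops the low-abundance expected variant `B`, which is the FP/FN trade-off described above.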
To evaluate VTAM, we compared it with two other pipelines: one based on DADA2 (Callahan et al. 2016) and LULU (Frøslev et al. 2017), and one based on OBITools3 (Boyer et al. 2016) and metabaR (Zinger et al. 2020). Two datasets, from fish and bat diet studies, were analysed with all three pipelines. Based on mock and negative samples, we show that VTAM had the best precision for mock samples in both datasets, while specificity in negative controls was comparable among the three pipelines (Fig. 1).
VTAM therefore constitutes a complete pipeline to filter and validate metabarcoding data, from raw FASTQ data to Amplicon Sequence Variant tables with taxonomic assignments. Our pipeline aggregates a series of features rarely grouped in a single pipeline and performs a non-arbitrary parameter optimization based on internal control samples to generate conservative but informative metabarcoding datasets. We believe VTAM provides a very valuable tool for the validation of metabarcoding data, which is essential for conducting robust analyses of biodiversity.
Pensoft Publishers

