Bacteria are everywhere, even in your COI marker gene data!
Abstract
The mitochondrial cytochrome c oxidase subunit I gene (COI) is commonly used in eDNA metabarcoding studies, especially for assessing metazoan diversity. Yet a great number of the COI operational taxonomic units (OTUs) and/or amplicon sequence variants (ASVs) retrieved from such studies do not get a taxonomic assignment against a reference sequence and are referred to as “dark matter”. To investigate this dark matter thoroughly, we have developed the Dark mAtteR iNvestigator (DARN) software tool. A COI-oriented reference phylogenetic tree was built from 1,240 consensus sequences covering all three domains of life, with more than 80% of those representing eukaryotic taxa. For eukaryotes, consensus sequences at the family level were constructed from 183,330 sequences retrieved from the Midori reference 2 database. Similarly, sequences from 559 bacterial and 41 archaeal genera were retrieved from the BOLD database. DARN uses the phylogenetic tree to investigate and quantify pre-processed sequences from amplicon samples, and provides both a tabular and a graphical overview of their phylogenetic assignments. To evaluate DARN, we analysed both environmental and bulk metabarcoding samples from different aquatic environments, amplified with various primer sets. We demonstrate that a large proportion of non-target prokaryotic organisms, such as bacteria and archaea, are also amplified in eDNA samples, and we suggest that bacterial COI sequences be included in the reference databases used for taxonomy assignment to allow for further analyses of dark matter. DARN source code is available on GitHub at https://github.com/hariszaf/darn and as a Docker image at https://hub.docker.com/r/hariszaf/darn.
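To make the family-level consensus step more concrete, the following is a minimal sketch of a per-column majority-rule consensus over sequences that are already aligned. It is only an illustration under stated assumptions: the function name `majority_consensus`, the toy input and the tie-breaking rule are hypothetical, and the abstract does not say which method or tool DARN actually uses to build its consensus sequences.

```python
from collections import Counter


def majority_consensus(aligned_seqs, gap="-"):
    """Return a majority-rule consensus of equal-length, pre-aligned sequences.

    Hypothetical helper for illustration only; not taken from the DARN code base.
    Columns whose most frequent symbol is the gap character are dropped.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_seqs):
        counts = Counter(symbol.upper() for symbol in column)
        # Assumption: ties are broken alphabetically so the result is deterministic.
        winner = max(sorted(counts), key=counts.get)
        if winner != gap:
            consensus.append(winner)
    return "".join(consensus)


# Toy alignment standing in for the COI sequences of one family.
family_alignment = ["ATGC-A", "ATGCTA", "ACGC-A"]
family_consensus = majority_consensus(family_alignment)
print(family_consensus)
```

In this toy case the second column is decided by two T against one C, and the fifth column is dropped because the gap is the most frequent symbol there.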
Author summary
DARN is a software approach that aims to provide further insight into COI amplicon data from environmental samples. Building a COI-oriented reference phylogenetic tree is challenging, especially given the small number of curated microbial COI sequences deposited in reference databases, e.g. ~4,000 bacterial and ~150 archaeal sequences in BOLD. As more such sequences are collated, the DARN approach will improve. To communicate both our approach and our results more interactively, we strongly encourage the reader to visit this Google Colab notebook, where all steps are described one by one, and this GitHub page, where our results are presented. Our approach corroborates the known presence of microbial sequences in COI environmental sequencing samples and highlights the need for curated bacterial and archaeal COI sequences and their integration into reference databases (e.g. Midori and BOLD). We argue that DARN will benefit researchers as a quality control tool for their sequenced samples, both for distinguishing eukaryotic from non-eukaryotic OTUs/ASVs and for understanding the unknown unknowns.
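The quality-control use described above, separating eukaryotic from non-eukaryotic OTUs/ASVs, can be sketched as a simple tabulation of domain-level assignments. This is a hedged illustration and not DARN's output format or API: the input dictionary, the `summarise_domains` and `non_eukaryotic_ids` helpers and the "Unassigned" label are all assumptions made for the example.

```python
from collections import Counter

# Assumed labels for the three domains of life, plus a catch-all bucket.
DOMAINS = ("Eukaryota", "Bacteria", "Archaea")
UNASSIGNED = "Unassigned"


def summarise_domains(assignments):
    """Map each domain to (count, fraction) for a dict of sequence id -> domain.

    Hypothetical helper: any label outside DOMAINS (including None) is
    counted as unassigned.
    """
    counts = Counter(
        label if label in DOMAINS else UNASSIGNED
        for label in assignments.values()
    )
    total = sum(counts.values())
    return {
        domain: (counts[domain], counts[domain] / total if total else 0.0)
        for domain in DOMAINS + (UNASSIGNED,)
    }


def non_eukaryotic_ids(assignments):
    """Return the sorted ids of sequences assigned to Bacteria or Archaea."""
    return sorted(
        seq_id
        for seq_id, label in assignments.items()
        if label in ("Bacteria", "Archaea")
    )


# Toy input: four ASVs with made-up domain assignments.
toy_assignments = {
    "asv1": "Eukaryota",
    "asv2": "Bacteria",
    "asv3": "Bacteria",
    "asv4": None,
}
summary = summarise_domains(toy_assignments)
flagged = non_eukaryotic_ids(toy_assignments)

for domain, (count, fraction) in summary.items():
    print(f"{domain}\t{count}\t{fraction:.2f}")
print("non-eukaryotic:", flagged)
```

Keeping unassigned sequences in their own bucket, rather than discarding them, keeps the share of remaining dark matter visible next to the share explained by prokaryotic amplification.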

