
KSGP 3.1: improved taxonomic annotation of Archaea communities using LotuS2, the genome taxonomy database and RNAseq data

Abstract: Taxonomic annotation is a substantial challenge for Archaea metabarcoding. A limited number of reference sequences are available; a substantial fraction of phylogenetic diversity is not fully characterized; widely used databases do not reflect current archaeal taxonomy and contain mislabelled sequences. We address these gaps with a systematic and tractable approach based around the Genome Taxonomy Database (GTDB) combined with the eukaryote PR2 and MIDORI mitochondrial databases. After removing incongruent, chimeric and duplicate SSU sequences, this combination (GTDB+) provides a small improvement in annotation of a set of estuarine Archaea Operational Taxonomic Units (OTUs) compared to SILVA. We add to this a collection of near-full-length rRNA sequences and the prokaryote SSU sequences in SILVA, creating a new reference database, KSGP (Karst, Silva, GTDB, and PR2). The additional sequences are (re-)annotated using three different approaches. The most conservative, using lowest common ancestor, gives a further small improvement. Annotation using SINTAX increases Class and Order assignments by 2.7 and 4.2 times over SILVA, although this may include some “lumping” of un-named and named clades. Still further improvement can be made using similarity-based clustering to group database sequences into putative taxa at all taxonomic levels, assigning 60% and 41% of Archaea OTUs to putative family and genus level taxa respectively. GTDB without cleaning and GreenGenes2 both perform poorly and cannot be recommended for use with Archaea. We make the GTDB+ and KSGP databases available at ksgp.earlham.ac.uk, integrate them into a metabarcoding pipeline, LotuS2, and outline their use to annotate Archaea OTUs and metatranscriptomic data.
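The abstract names lowest common ancestor (LCA) as the most conservative of the three re-annotation approaches. As a rough illustration of the general idea only, and not the authors' implementation or the LotuS2 code, the sketch below keeps the rank-by-rank prefix on which all reference hits for a sequence agree. The function name `lowest_common_ancestor` and the placeholder taxon labels are assumptions made for this example.

```python
from typing import List, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest rank-by-rank prefix shared by every lineage.

    Each lineage is an ordered list of taxon names from the highest rank
    (domain) downwards. Hypothetical helper, for illustration only.
    """
    if not lineages:
        return []
    consensus: List[str] = []
    # zip(*) walks the lineages rank by rank and stops at the shortest one.
    for names_at_rank in zip(*lineages):
        if len(set(names_at_rank)) != 1:
            break  # the hits disagree at this rank, so stop here
        consensus.append(names_at_rank[0])
    return consensus


# Placeholder lineages (not real taxa): two hits that agree down to family.
hits = [
    ["Archaea", "phylum_A", "class_A", "order_A", "family_A", "genus_A"],
    ["Archaea", "phylum_A", "class_A", "order_A", "family_A", "genus_B"],
]
print(lowest_common_ancestor(hits))
```

Because the assignment stops at the first rank where the hits disagree, this style of annotation is cautious by construction. That fits the abstract's description of LCA as the most conservative option, giving only a small further improvement.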
