
Improved taxonomic annotation of Archaea communities using LotuS2, the Genome Taxonomy Database and RNAseq data

Abstract
Metabarcoding is increasingly used to uncover diversity and characterise communities of Archaea in various habitats, but taxonomic annotation of their sequences remains more challenging than for bacteria. Fewer reference sequences are available, widely used databases do not reflect recent revisions of higher-level archaeal taxonomy, and a substantial fraction of archaeal phylogenetic diversity remains to be fully characterised. We address these gaps with a systematic and tractable approach based around the Genome Taxonomy Database (GTDB). GTDB provides a standardized taxonomy with normalized ranks based on protein-coding genes, allowing us to identify and remove incongruent SSU sequences. We then use the resulting sequence set in combination with the eukaryote PR2 database to annotate a collection of near-full-length rRNA sequences and the Archaea SSU sequences in SILVA, creating a new reference database, KSGP (Karst, Silva, GTDB and PR2). GTDB SSUs alone provide a small improvement in annotation of an example marine Archaea OTU data set over standardized SSU databases such as SILVA and Greengenes2, while KSGP increases Class and Order assignments by 145% and 280% respectively and is likely to provide some improvement in annotation of bacterial sequences too. We make the KSGP database and a cleaned and deduplicated subset of GTDB SSU sequences available at ksgp.earlham.ac.uk, integrate them into a metabarcoding pipeline, LotuS2, and outline rapid and robust strategies to generate a set of annotated Archaea OTUs and to determine the proportion of Archaea sequences in metatranscriptomic data. We also demonstrate simple tools to visualise the completeness of database coverage and outline strategies to further understand poorly characterised components of the archaeal community, which will be equally applicable to bacteria.
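The reported gains in Class and Order assignments can be made concrete with a small worked example. The sketch below is not the authors' code and does not use LotuS2 or KSGP. It assumes the percentages are relative increases in the number of OTUs assigned at a rank (the abstract does not define them), in which case a 145% increase would mean 2.45 times as many assigned OTUs. The OTU tables, the taxon names, the `UNASSIGNED` markers, and the function names `assigned_counts` and `percent_increase` are all hypothetical, and the toy numbers are invented for illustration only.

```python
# Ranks in the order they appear in each lineage list (assumed layout).
RANKS = ["Domain", "Phylum", "Class", "Order"]

# Markers treated as "no assignment at this rank" (an assumption; real
# pipelines use their own conventions).
UNASSIGNED = {"", "?", "NA", "unclassified"}


def assigned_counts(taxonomy):
    """Count, for each rank, how many OTUs carry an assignment at that rank."""
    counts = {rank: 0 for rank in RANKS}
    for lineage in taxonomy.values():
        # zip stops at the shorter sequence, so truncated lineages simply
        # contribute nothing at the missing ranks.
        for rank, name in zip(RANKS, lineage):
            if name not in UNASSIGNED:
                counts[rank] += 1
    return counts


def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("relative increase is undefined when 'before' is 0")
    return 100.0 * (after - before) / before


# Invented annotations of the same four OTUs against two reference databases.
toy_db_a = {
    "OTU1": ["Archaea", "phylum_A", "class_A", "?"],
    "OTU2": ["Archaea", "phylum_A", "?", "?"],
    "OTU3": ["Archaea", "?", "?", "?"],
    "OTU4": ["Archaea", "phylum_B", "class_B", "order_B"],
}
toy_db_b = {
    "OTU1": ["Archaea", "phylum_A", "class_A", "order_A"],
    "OTU2": ["Archaea", "phylum_A", "class_A", "?"],
    "OTU3": ["Archaea", "phylum_C", "class_C", "order_C"],
    "OTU4": ["Archaea", "phylum_B", "class_B", "order_B"],
}

counts_a = assigned_counts(toy_db_a)
counts_b = assigned_counts(toy_db_b)

for rank in ("Class", "Order"):
    gain = percent_increase(counts_a[rank], counts_b[rank])
    print(f"{rank}: {counts_a[rank]} -> {counts_b[rank]} OTUs ({gain:.0f}% increase)")
```

Counting per rank rather than per complete lineage matches how the abstract reports the gains, since an OTU can gain a Class assignment without gaining an Order.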

