Comparison of three clustering approaches for detecting novel environmental microbial diversity

Discovery of novel diversity in high-throughput sequencing (HTS) studies is a central task in environmental microbial ecology. To evaluate the effects that amplicon clustering methods have on novel diversity discovery, we clustered an environmental HTS dataset of marine protist reads together with accessions from the taxonomically curated PR2 reference database using three de novo approaches: sequence similarity networks, USEARCH, and Swarm. The novel diversity uncovered by each clustering approach differed drastically, both in the number of operational taxonomic units (OTUs) and in the number of environmental amplicons within these novel diversity OTUs. Global pairwise alignment comparisons revealed that numerous amplicons classified as novel by USEARCH and Swarm were actually highly similar to reference accessions. Using graph theory, we found additional novel diversity within OTUs that would have gone unnoticed without further analysis of their underlying network topologies. Our results suggest that novel diversity inferred from clustering approaches requires further validation, whereas graph theory provides a powerful tool for microbial ecology and the analysis of environmental HTS datasets.
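The general idea of a sequence similarity network can be illustrated with a small sketch. This is not the authors' pipeline and does not reproduce USEARCH or Swarm: it assumes a toy similarity measure (1 minus the edit distance divided by the longer sequence length) in place of a real global pairwise alignment, treats each connected component of the network as an OTU, and flags an OTU as candidate novel diversity when it contains no reference accession. The function names, the threshold value, and the toy sequences are all hypothetical.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Toy stand-in for global alignment identity (an assumption of this sketch)."""
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def cluster_and_flag_novel(sequences, reference_ids, threshold=0.97):
    """Build a similarity network and return (otus, novel_otus).

    sequences: dict mapping sequence id -> nucleotide string
    reference_ids: set of ids that come from the reference database
    An edge joins two sequences whose similarity is >= threshold;
    each connected component is reported as one OTU.
    """
    parent = {sid: sid for sid in sequences}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(sequences, 2):
        if similarity(sequences[u], sequences[v]) >= threshold:
            parent[find(u)] = find(v)

    components = {}
    for sid in sequences:
        components.setdefault(find(sid), set()).add(sid)

    otus = list(components.values())
    # An OTU with no reference accession is only a *candidate* for novelty.
    novel = [otu for otu in otus if not (otu & reference_ids)]
    return otus, novel


# Hypothetical toy data: one reference accession and three environmental reads.
toy_sequences = {
    "ref1": "ACGTACGTAC",
    "env1": "ACGTACGTAA",
    "env2": "TTTTGGGGCC",
    "env3": "TTTTGGGGCA",
}
toy_otus, toy_novel = cluster_and_flag_novel(toy_sequences, {"ref1"}, threshold=0.85)
```

In this toy run, `env1` joins the OTU containing `ref1`, while `env2` and `env3` form an OTU without any reference. In line with the abstract's caution, such an OTU would still need validation, for example by comparing each of its members directly against the reference accessions.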

PeerJ

Dominik Forster, Micah Dunthorn, Thorsten Stoeck, Frédéric Mahé

2018


