
The importance of study design for detecting differentially abundant features in high-throughput experiments

ABSTRACT: The use of high-throughput experiments, such as RNA-seq, to simultaneously identify differentially abundant entities across conditions has become widespread, but the systematic planning of such studies is currently hampered by the lack of general-purpose tools. Here we demonstrate that there is substantial variability in performance across statistical tests, normalization techniques and study conditions, potentially leading to significant wastage of resources and/or missing information in the absence of careful study design. We present a broadly applicable experimental design tool called EDDA, the first for single-cell RNA-seq, Nanostring and Metagenomic studies, that can be used to i) rationally choose from a panel of statistical tests, ii) measure expected performance for a study and iii) plan experiments to minimize mis-utilization of valuable resources. Using case studies from recent single-cell RNA-seq, Nanostring and Metagenomics studies, we highlight its general utility and, in particular, show a) the ability to correctly model single-cell RNA-seq data and perform comparisons with 1/5th the amount of sequencing currently used and b) that the selection of suitable statistical tests strongly impacts the ability to detect biomarkers in Metagenomic studies. Furthermore, we demonstrate that a novel mode-based normalization employed in EDDA uniformly improves robustness over existing approaches (10-20%) and increases precision in detecting differential abundance by up to 140%.
