
RUV-III-NB: Normalization of single cell RNA-seq Data

Abstract
Despite numerous methodological advances, the normalization of single cell RNA-seq (scRNA-seq) data remains a challenging task, and the performance of different methods can vary greatly across datasets. Part of the reason for this is the different kinds of unwanted variation, including library size, batch and cell cycle effects, and the association of these with the biology embodied in the cells. A normalization method that does not explicitly take cell biology into account risks removing some of the signal of interest. Furthermore, most normalization methods remove the effects of unwanted variation from the cell embedding used for clustering-based analysis but not from the gene-level data typically used for differential expression (DE) analysis to identify marker genes. Here we propose RUV-III-NB, a statistical method that can be used to remove unwanted variation from both the cell embedding and gene-level counts. When removing unwanted variation, RUV-III-NB uses pseudo-replicates to explicitly take into account the potential association of that variation with biology. The method can be used for both UMI and sequence read counts and returns adjusted counts that can be used for downstream analyses such as clustering, DE and pseudotime analyses. Using five publicly available datasets that encompass different technological platforms, kinds of biology and levels of association between biology and unwanted variation, we show that RUV-III-NB removes library size and batch effects, strengthens biological signals, improves differential expression analyses, and leads to results exhibiting greater concordance with independent datasets of the same kind. The performance of RUV-III-NB is consistent across the five datasets and is not sensitive to the number of factors assumed to contribute to the unwanted variation. It also shows promise for removing other kinds of unwanted variation such as platform effects.
The method is implemented as an R package publicly available from https://github.com/limfuxing/ruvIIInb.
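
To make the role of pseudo-replicates concrete, here is a minimal Python sketch of a simplified, linear (Gaussian) RUV-III-style adjustment on log-scale data. It is not the negative binomial model of RUV-III-NB and does not use the ruvIIInb package. The function name `ruv3_linear`, the simulated data, and the choice of negative control genes are all assumptions made for illustration.

```python
import numpy as np


def ruv3_linear(Y, M, ctl, k):
    """Simplified linear RUV-III-style adjustment (NOT the negative binomial model).

    Y   : (cells x genes) array of log-scale expression
    M   : (cells x sets) 0/1 array; M[i, j] = 1 if cell i belongs to
          pseudo-replicate set j (cells assumed to share the same biology)
    ctl : boolean array (length genes) marking negative control genes
    k   : number of unwanted factors to remove
    """
    Y = np.asarray(Y, dtype=float)
    M = np.asarray(M, dtype=float)
    ctl = np.asarray(ctl, dtype=bool)

    gene_means = Y.mean(axis=0)
    Yc = Y - gene_means

    # Project out pseudo-replicate set means; what remains is variation among
    # cells assumed to be biologically alike, i.e. unwanted variation.
    resid_op = np.eye(Y.shape[0]) - M @ np.linalg.pinv(M)
    U, _, _ = np.linalg.svd(resid_op @ Yc, full_matrices=False)

    # Gene-level loadings of the k leading unwanted factors.
    alpha = U[:, :k].T @ Yc
    alpha_c = alpha[:, ctl]
    # Cell-level unwanted factors, estimated from the control genes only.
    W = Yc[:, ctl] @ alpha_c.T @ np.linalg.inv(alpha_c @ alpha_c.T)
    return Yc - W @ alpha + gene_means


# Hypothetical toy data: two cell types (biology) crossed with two batches.
rng = np.random.default_rng(0)
n_cells, n_genes = 40, 50
cell_type = np.repeat([0, 1], 20)
batch = np.tile([0, 1], 20)                 # balanced within each cell type
ctl = np.arange(n_genes) >= 20              # genes 20..49 carry no biology
beta = np.where(ctl, 0.0, rng.normal(0, 2, n_genes))
batch_effect = rng.normal(0, 1, n_genes)
Y = (5.0 + np.outer(cell_type, beta) + np.outer(batch, batch_effect)
     + rng.normal(0, 0.1, (n_cells, n_genes)))
# Pseudo-replicate sets: cells of the same type, regardless of batch.
M = np.stack([cell_type == 0, cell_type == 1], axis=1).astype(float)

Y_adj = ruv3_linear(Y, M, ctl, k=1)


def group_gap(X, labels):
    """Per-gene difference between the two group means."""
    return X[labels == 1].mean(axis=0) - X[labels == 0].mean(axis=0)


print("batch gap before:", np.linalg.norm(group_gap(Y, batch)))
print("batch gap after: ", np.linalg.norm(group_gap(Y_adj, batch)))
```

Because the pseudo-replicate sets span both batches, differences within a set are attributed to unwanted variation rather than biology. This is why the adjustment can remove the batch effect while leaving the cell-type signal in place, even on the gene-level data.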

Related Results

Abstract P1-05-23: Utilities and challenges of RNA-Seq based expression and variant calling in a clinical setting
Abstract Introduction Variant calling based on DNA samples has been the gold standard of clinical testing since the advent of Sanger sequencing. The u...
Generating Synthetic Single Cell Data from Bulk RNA-seq Using a Pretrained Variational Autoencoder
Abstract Single cell RNA sequencing (scRNA-seq) is a powerful approach which generates genome-wide gene expression profiles at single cell resolution. Among its many applications, i...
Benchmarking Algorithms for Gene Set Scoring of Single-cell ATAC-seq Data
Abstract Gene set scoring (GSS) has been routinely conducted for gene expression analysis of bulk or single-cell RNA-seq data, which helps to decipher single-cell heterogeneity and ...
Detection of Multiple Types of Cancer Driver Mutations Using Targeted RNA Sequencing in NSCLC
Abstract Currently, DNA and RNA are used separately to capture different types of gene mutations. DNA is commonly used for the detection of SNVs, indels and CNVs; RNA is used for an...
Complex Collision Tumors: A Systematic Review
Abstract Introduction: A collision tumor consists of two distinct neoplastic components located within the same organ, separated by stromal tissue, without histological intermixing...
Detecting RNA–RNA interactome
Abstract The last decade has seen a robust increase in various types of novel RNA molecules and their complexity in gene regulation. RNA molecules play a critical role in cellular e...
Abstract 2323: Deciphering RNA degradation: Insights from a comparative analysis of paired fresh frozen/FFPE total RNA-seq
Abstract Background: Fresh frozen (FF) and formalin-fixed paraffin-embedded (FFPE) samples are primary resources for archival tissues in cancer studies. Despite the ...
MuSiC2: cell type deconvolution for multi-condition bulk RNA-seq data
Abstract Cell type composition of intact bulk tissues can vary across samples. Deciphering cell type composition and its changes during disease progression is an important step towa...
