Javascript must be enabled to continue!
AuPairWise: a method to estimate RNA-seq replicability through co-expression
View through CrossRef
Abstract
In addition to detecting novel transcripts and higher dynamic range, a principal claim for RNA-sequencing has been greater replicability, typically measured in sample-sample correlations of gene expression levels. Through a re-analysis of ENCODE data, we show that replicability of transcript abundances will provide misleading estimates of the replicability of conditional variation in transcript abundances (i.e., most expression experiments). Heuristics which implicitly address this problem have emerged in quality control measures to obtain ‘good’ differential expression results. However, these methods involve strict filters such as discarding low expressing genes or using technical replicates to remove discordant transcripts, and are costly or simply ad hoc. As an alternative, we model gene-level replicability of differential activity using co-expressing genes. We find that sets of housekeeping interactions provide a sensitive means of estimating the replicability of expression changes, where the co-expressing pair can be regarded as pseudo-replicates of one another. We model the effects of noise that perturbs a gene’s expression within its usual distribution of values and show that perturbing expression by only 5% within that range is readily detectable (AUROC~0.73). We have made our method available as a set of easily implemented R scripts.
Author Summary
RNA-sequencing has become a popular means to detect the expression levels of genes. However, quality control is still challenging, requiring both extreme measures and rules which are set in stone from extensive previous analysis. Instead of relying on these rules, we show that co-expression can be used to measure biological replicability with extremely high precision. Co-expression is a well-studied phenomenon, in which two genes that are known to form a functional unit are also expressed at similar levels, and change in similar ways across conditions. Using this concept, we can detect how well an experiment replicates by measuring how well it has retained the co-expression pattern across defined gene-pairs. We do this by measuring how easy it is to detect a sample to which some noise has been added. We show this method is a useful tool for quality control.
Title: AuPairWise: a method to estimate RNA-seq replicability through co-expression
Description:
Abstract
In addition to detecting novel transcripts and higher dynamic range, a principal claim for RNA-sequencing has been greater replicability, typically measured in sample-sample correlations of gene expression levels.
Through a re-analysis of ENCODE data, we show that replicability of transcript abundances will provide misleading estimates of the replicability of conditional variation in transcript abundances (i.
e.
, most expression experiments).
Heuristics which implicitly address this problem have emerged in quality control measures to obtain ‘good’ differential expression results.
However, these methods involve strict filters such as discarding low expressing genes or using technical replicates to remove discordant transcripts, and are costly or simply ad hoc.
As an alternative, we model gene-level replicability of differential activity using co-expressing genes.
We find that sets of housekeeping interactions provide a sensitive means of estimating the replicability of expression changes, where the co-expressing pair can be regarded as pseudo-replicates of one another.
We model the effects of noise that perturbs a gene’s expression within its usual distribution of values and show that perturbing expression by only 5% within that range is readily detectable (AUROC~0.
73).
We have made our method available as a set of easily implemented R scripts.
Author Summary
RNA-sequencing has become a popular means to detect the expression levels of genes.
However, quality control is still challenging, requiring both extreme measures and rules which are set in stone from extensive previous analysis.
Instead of relying on these rules, we show that co-expression can be used to measure biological replicability with extremely high precision.
Co-expression is a well-studied phenomenon, in which two genes that are known to form a functional unit are also expressed at similar levels, and change in similar ways across conditions.
Using this concept, we can detect how well an experiment replicates by measuring how well it has retained the co-expression pattern across defined gene-pairs.
We do this by measuring how easy it is to detect a sample to which some noise has been added.
We show this method is a useful tool for quality control.
Related Results
Abstract P1-05-23: Utilities and challenges of RNA-Seq based expression and variant calling in a clinical setting
Abstract P1-05-23: Utilities and challenges of RNA-Seq based expression and variant calling in a clinical setting
Abstract
Introduction
Variant calling based on DNA samples has been the gold standard of clinical testing since the advent of Sanger sequencing. The u...
Detection of Multiple Types of Cancer Driver Mutations Using Targeted RNA Sequencing in NSCLC
Detection of Multiple Types of Cancer Driver Mutations Using Targeted RNA Sequencing in NSCLC
ABSTRACTCurrently, DNA and RNA are used separately to capture different types of gene mutations. DNA is commonly used for the detection of SNVs, indels and CNVs; RNA is used for an...
Abstract 2323: Deciphering RNA degradation: Insights from a comparative analysis of paired fresh frozen/FFPE total RNA-seq
Abstract 2323: Deciphering RNA degradation: Insights from a comparative analysis of paired fresh frozen/FFPE total RNA-seq
Abstract
Background: Fresh frozen (FF) and formalin-fixed paraffin-embedded (FFPE) samples are primary resources for archival tissues in cancer studies. Despite the ...
Detecting RNA–RNA interactome
Detecting RNA–RNA interactome
AbstractThe last decade has seen a robust increase in various types of novel RNA molecules and their complexity in gene regulation. RNA molecules play a critical role in cellular e...
Generating Synthetic Single Cell Data from Bulk RNA-seq Using a Pretrained Variational Autoencoder
Generating Synthetic Single Cell Data from Bulk RNA-seq Using a Pretrained Variational Autoencoder
AbstractSingle cell RNA sequencing (scRNA-seq) is a powerful approach which generates genome-wide gene expression profiles at single cell resolution. Among its many applications, i...
Benchmarking Algorithms for Gene Set Scoring of Single-cell ATAC-seq Data
Benchmarking Algorithms for Gene Set Scoring of Single-cell ATAC-seq Data
AbstractGene set scoring (GSS) has been routinely conducted for gene expression analysis of bulk or single-cell RNA-seq data, which helps to decipher single-cell heterogeneity and ...
Diagnostic Validation of a Clinical Laboratory-Oriented Targeted RNA Sequencing System As a Comprehensive Assay for Hematologic Malignancies
Diagnostic Validation of a Clinical Laboratory-Oriented Targeted RNA Sequencing System As a Comprehensive Assay for Hematologic Malignancies
Introduction Targeted RNA sequencing (RNA-seq) is a highly accurate method for sequencing transcripts of interest and can overcome limitations regarding resolution, throughput, and...
A Nile Grass Rat Transcriptomic Landscape Across 22 Organs By Ultra-deep Sequencing and Comparative RNA-seq pipeline (CRSP)
A Nile Grass Rat Transcriptomic Landscape Across 22 Organs By Ultra-deep Sequencing and Comparative RNA-seq pipeline (CRSP)
Abstract
RNA sequencing (RNA-seq) has been a widely used high-throughput method to characterize transcriptomic dynamics spatiotemporally. However...

