Heterogeneous pseudobulk simulation enables realistic benchmarking of cell-type deconvolution methods

Abstract: Computational cell type deconvolution enables estimation of cell type abundance from bulk tissues and is important for understanding cell-cell interactions, especially in tumor tissues. With the rapid development of deconvolution methods, many benchmarking studies have been published that aim for a comprehensive evaluation of these methods. Benchmarking studies rely on cell-type resolved single-cell RNA-seq data to create simulated pseudobulk datasets by adding individual cell types in controlled proportions. In our work, we show that the standard application of this approach, which uses randomly selected single cells regardless of the intrinsic differences between them, generates synthetic bulk expression values that lack appropriate biological variance. We demonstrate why and how the current bulk simulation pipeline with random cells is unrealistic and propose a heterogeneous simulation strategy as a solution. Our heterogeneously simulated samples show realistic variance across hallmark gene sets when compared with real bulk samples from the TCGA dataset of the same tumor type. Using this new simulation pipeline to benchmark deconvolution methods, we show that introducing biological heterogeneity has a notable effect on the results. Evaluating the robustness of different deconvolution approaches to heterogeneous simulation, we find that reference-free methods that rely on simplex estimation perform poorly, marker-based methods and BayesPrism are most robust, and regression-based approaches fall in between. Importantly, we find that under the heterogeneous scenario, marker-based methods and BayesPrism outperform state-of-the-art reference methods. Our findings highlight how different conceptual approaches can negate unmodeled heterogeneity and suggest that there is room for further methodological development.
Cold Spring Harbor Laboratory
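The contrast between random-cell and heterogeneous pseudobulk simulation described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' pipeline: the abstract does not specify how heterogeneity is introduced, so the sketch assumes one possible reading, in which all cells of a given type in one simulated sample come from a single subgroup (for example one donor or cluster). The function name `simulate_pseudobulk`, the `group` labels, and the toy Poisson data are all hypothetical.

```python
import numpy as np

def simulate_pseudobulk(expr, cell_type, group, proportions, n_cells, rng,
                        heterogeneous):
    """Sum single cells into one pseudobulk expression vector.

    expr: (cells x genes) count matrix; cell_type and group: per-cell labels.
    proportions: dict mapping cell type -> fraction of cells in the sample.
    """
    bulk = np.zeros(expr.shape[1])
    for ct, frac in proportions.items():
        pool = np.flatnonzero(cell_type == ct)
        if heterogeneous:
            # ASSUMPTION (not from the source): restrict this cell type to
            # one randomly chosen subgroup per simulated sample.
            chosen = rng.choice(np.unique(group[pool]))
            pool = pool[group[pool] == chosen]
        k = int(round(frac * n_cells))
        picked = rng.choice(pool, size=k, replace=True)
        bulk += expr[picked].sum(axis=0)
    return bulk

# Toy single-cell data: 2 cell types x 2 subgroups, 3 genes (made-up means).
rng = np.random.default_rng(0)
n_per = 200
cell_type = np.repeat(["T", "tumor"], 2 * n_per)
group = np.tile(np.repeat(["a", "b"], n_per), 2)
means = {
    ("T", "a"): [2, 20, 5],
    ("T", "b"): [20, 2, 5],
    ("tumor", "a"): [5, 5, 30],
    ("tumor", "b"): [5, 5, 3],
}
lam = np.array([means[(str(t), str(g))] for t, g in zip(cell_type, group)],
               dtype=float)
expr = rng.poisson(lam)

# Fixed proportions, so any spread across samples comes from cell choice.
props = {"T": 0.3, "tumor": 0.7}

def simulate_many(heterogeneous, n_samples=200):
    return np.array([
        simulate_pseudobulk(expr, cell_type, group, props, 100, rng,
                            heterogeneous)
        for _ in range(n_samples)
    ])

# Per-gene variance across simulated samples under each strategy.
var_random = simulate_many(False).var(axis=0)
var_hetero = simulate_many(True).var(axis=0)
print("random-cell variance:  ", var_random)
print("heterogeneous variance:", var_hetero)
```

In this toy setting, pooling random cells averages out subgroup differences, so the per-gene variance across simulated samples is much smaller than when each sample draws a cell type from a single subgroup. That is the qualitative effect the abstract attributes to random-cell simulation, shown here only under the stated assumptions.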

Related Results

Sparsity‐enhanced wavelet deconvolution
Abstract: We propose a three‐step bandwidth enhancing wavelet deconvolution process, combining linear inverse filtering and non‐linear reflectivity construction based on a sparseness...
MARS-seq2.0: an experimental and analytical pipeline for indexed sorting combined with single-cell RNA sequencing v1
Human tissues comprise trillions of cells that populate a complex space of molecular phenotypes and functions and that vary in abundance by 4–9 orders of magnitude. Relying solely ...
Abstract 1554: Development of a deconvolution algorithm for tissue-based gene expression data
Abstract: Tissue data provide substantially more information than cell-line data, and offer new opportunities to study cancer biology and evolution in its actual micr...
An optimisational model of benchmarking
Purpose: The purpose of this paper is to develop a quantitative methodology for benchmarking process which is simple, effective and efficient as a rejoinder to benchmarking detractor...
A review on benchmarking of supply chain performance measures
Purpose: The purpose of this paper is to redress the imbalances in the past literature of supply chain benchmarking and enhance data envelopment analysis (DEA) modeling approach in s...
Wave Scattering Deconvolution
Abstract: The least-squares approach is commonly used for spiking and predictive deconvolution. An alternative approach is wave scattering deconvolution (WSD) prop...
MuSiC2: cell type deconvolution for multi-condition bulk RNA-seq data
Abstract: Cell type composition of intact bulk tissues can vary across samples. Deciphering cell type composition and its changes during disease progression is an important step towa...
Deconvolution Methods to Link Multi‐Omics Data to Cell Type‐Specific Extracellular Vesicle Abundances
Abstract: Extracellular vesicles (EVs) provide non‐invasive information on cellular health and disease. Yet, with the small size of EVs and more than 200 cell types contributing EVs ...