Javascript must be enabled to continue!
Generating Synthetic Single Cell Data from Bulk RNA-seq Using a Pretrained Variational Autoencoder
View through CrossRef
AbstractSingle cell RNA sequencing (scRNA-seq) is a powerful approach which generates genome-wide gene expression profiles at single cell resolution. Among its many applications, it enables determination of the transcriptional states of distinct cell types in complex tissues, thereby allowing the precise cell type and set of genes driving a disease to be identified. However, scRNA-seq remains costly, and there are extremely limited samples generated in even the most extensive human disease studies. In sharp contrast, there is a wealth of publicly available bulk RNA-seq data, in which single cell and cell type information are effectively averaged. To further leverage this wealth of RNA-seq data, methods have been developed to infer the fraction of cell types from bulk RNA-seq data using single cell data to train models. Additionally, generative AI models have been developed to generate more of an existing scRNA-seq dataset. In this study, we develop an innovative framework that takes full advantage of powerful generative AI approaches and existing scRNA-seq data to generate representative scRNA-seq data from bulk RNA-seq. Our bulk to single cell variational autoencoder-based model, termedbulk2sc, is trained to deconvolve pseudo-bulk RNA-seq datasets back into their constituent single-cell transcriptomes by learning the specific distributions and proportions related to each cell type. We assess the performance of bulk2sc by comparing synthetically generated scRNA-seq to actual scRNA-seq data. Application of bulk2sc to large-scale bulk RNA-seq human disease datasets could yield single cell level insights into disease processes and suggest targeted scRNA-seq experiments.
Title: Generating Synthetic Single Cell Data from Bulk RNA-seq Using a Pretrained Variational Autoencoder
Description:
AbstractSingle cell RNA sequencing (scRNA-seq) is a powerful approach which generates genome-wide gene expression profiles at single cell resolution.
Among its many applications, it enables determination of the transcriptional states of distinct cell types in complex tissues, thereby allowing the precise cell type and set of genes driving a disease to be identified.
However, scRNA-seq remains costly, and there are extremely limited samples generated in even the most extensive human disease studies.
In sharp contrast, there is a wealth of publicly available bulk RNA-seq data, in which single cell and cell type information are effectively averaged.
To further leverage this wealth of RNA-seq data, methods have been developed to infer the fraction of cell types from bulk RNA-seq data using single cell data to train models.
Additionally, generative AI models have been developed to generate more of an existing scRNA-seq dataset.
In this study, we develop an innovative framework that takes full advantage of powerful generative AI approaches and existing scRNA-seq data to generate representative scRNA-seq data from bulk RNA-seq.
Our bulk to single cell variational autoencoder-based model, termedbulk2sc, is trained to deconvolve pseudo-bulk RNA-seq datasets back into their constituent single-cell transcriptomes by learning the specific distributions and proportions related to each cell type.
We assess the performance of bulk2sc by comparing synthetically generated scRNA-seq to actual scRNA-seq data.
Application of bulk2sc to large-scale bulk RNA-seq human disease datasets could yield single cell level insights into disease processes and suggest targeted scRNA-seq experiments.
Related Results
Abstract P1-05-23: Utilities and challenges of RNA-Seq based expression and variant calling in a clinical setting
Abstract P1-05-23: Utilities and challenges of RNA-Seq based expression and variant calling in a clinical setting
Abstract
Introduction
Variant calling based on DNA samples has been the gold standard of clinical testing since the advent of Sanger sequencing. The u...
MuSiC2: cell type deconvolution for multi-condition bulk RNA-seq data
MuSiC2: cell type deconvolution for multi-condition bulk RNA-seq data
ABSTRACTCell type composition of intact bulk tissues can vary across samples. Deciphering cell type composition and its changes during disease progression is an important step towa...
Benchmarking Algorithms for Gene Set Scoring of Single-cell ATAC-seq Data
Benchmarking Algorithms for Gene Set Scoring of Single-cell ATAC-seq Data
AbstractGene set scoring (GSS) has been routinely conducted for gene expression analysis of bulk or single-cell RNA-seq data, which helps to decipher single-cell heterogeneity and ...
Complex Collision Tumors: A Systematic Review
Complex Collision Tumors: A Systematic Review
Abstract
Introduction: A collision tumor consists of two distinct neoplastic components located within the same organ, separated by stromal tissue, without histological intermixing...
Detection of Multiple Types of Cancer Driver Mutations Using Targeted RNA Sequencing in NSCLC
Detection of Multiple Types of Cancer Driver Mutations Using Targeted RNA Sequencing in NSCLC
ABSTRACTCurrently, DNA and RNA are used separately to capture different types of gene mutations. DNA is commonly used for the detection of SNVs, indels and CNVs; RNA is used for an...
Detecting RNA–RNA interactome
Detecting RNA–RNA interactome
AbstractThe last decade has seen a robust increase in various types of novel RNA molecules and their complexity in gene regulation. RNA molecules play a critical role in cellular e...
Abstract 2323: Deciphering RNA degradation: Insights from a comparative analysis of paired fresh frozen/FFPE total RNA-seq
Abstract 2323: Deciphering RNA degradation: Insights from a comparative analysis of paired fresh frozen/FFPE total RNA-seq
Abstract
Background: Fresh frozen (FF) and formalin-fixed paraffin-embedded (FFPE) samples are primary resources for archival tissues in cancer studies. Despite the ...
Frequency of Common Chromosomal Abnormalities in Patients with Idiopathic Acquired Aplastic Anemia
Frequency of Common Chromosomal Abnormalities in Patients with Idiopathic Acquired Aplastic Anemia
Objective: To determine the frequency of common chromosomal aberrations in local population idiopathic determine the frequency of common chromosomal aberrations in local population...

