Unmeasured human transcription factor ChIP-seq data shape functional genomics and demand strategic prioritization
Abstract
Transcription factor (TF) chromatin immunoprecipitation followed by sequencing (ChIP-seq) is essential for identifying genome-wide TF-binding sites (TFBSs), and the collected datasets support a variety of downstream analyses, such as inference of gene regulatory networks and prediction of the effects of single-nucleotide polymorphisms (SNPs) on TFBSs. Although TF ChIP-seq data continue to accumulate in public databases, comprehensive coverage of biologically relevant TF–sample pairs (i.e. combinations of a targeted TF and a cell type) remains elusive. This is because ChIP-seq requires TF-specific antibodies and large cell numbers, which limits the feasible TF–cell type combinations. Moreover, TF binding can be measured by ChIP-seq only when the TF is expressed in the target cell type. Thus, defining the full space of biologically relevant TF–sample pairs, both measured and unmeasured, is essential to assess and improve dataset comprehensiveness. Here, we investigated publicly available human TF ChIP-seq datasets and introduced the concept of unmeasured TF–sample pairs, defined as biologically relevant TF–sample combinations for which ChIP-seq experiments have not yet been performed. Notably, many TFs expressed in specific cell types remain unmeasured by ChIP-seq, which affects the coverage of regulatory regions revealed by TF ChIP-seq and genome-wide association study (GWAS) SNP analyses. Furthermore, we propose practical strategies to efficiently supplement currently unmeasured data and discuss how these approaches can enhance data-driven research. The database of unmeasured human TF–sample pairs is publicly accessible at https://moccs-db.shinyapps.io/Unmeasured_shiny_v1/, facilitating the systematic expansion of TF ChIP-seq datasets and thereby improving our understanding of gene regulatory mechanisms.
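The definition of an unmeasured TF–sample pair can be read as a set difference: the biologically relevant pairs (TF expressed in the cell type) minus the pairs that already have a ChIP-seq experiment. The Python sketch below illustrates that reading only. The function names, the 1.0 TPM expression cutoff, and the toy expression values and measured pairs are all assumptions made for illustration; they are not taken from the study or its database.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (TF, cell type)


def relevant_pairs(expression: Dict[str, Dict[str, float]],
                   min_tpm: float = 1.0) -> Set[Pair]:
    """Pairs where the TF is expressed in the cell type.

    The min_tpm cutoff is an assumed, illustrative criterion for "expressed".
    """
    return {
        (tf, cell)
        for cell, tf_levels in expression.items()
        for tf, tpm in tf_levels.items()
        if tpm >= min_tpm
    }


def unmeasured_pairs(expression: Dict[str, Dict[str, float]],
                     measured: Set[Pair],
                     min_tpm: float = 1.0) -> Set[Pair]:
    """Relevant pairs that have no ChIP-seq experiment yet."""
    return relevant_pairs(expression, min_tpm) - measured


# Toy inputs (hypothetical values, not real measurements).
expression = {
    "K562": {"GATA1": 120.0, "CTCF": 45.0, "HNF4A": 0.0},
    "HepG2": {"GATA1": 0.2, "CTCF": 50.0, "HNF4A": 80.0},
}
measured = {("CTCF", "K562"), ("GATA1", "K562"), ("CTCF", "HepG2")}

relevant = relevant_pairs(expression)
unmeasured = unmeasured_pairs(expression, measured)

# Fraction of relevant pairs already covered by ChIP-seq.
coverage = len(relevant & measured) / len(relevant)

print(sorted(unmeasured))
print(f"coverage = {coverage:.2f}")
```

In this toy case, the only pair that is expressed but lacks a ChIP-seq experiment is the one reported as unmeasured, and the coverage value gives a simple summary of how complete the measured set is relative to the relevant space.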

