Gene Set Analysis Using Spatial Statistics
Differential gene expression analysis studies the possible association between gene expression, measured with technologies such as DNA microarrays or RNA-Seq, and the phenotype. It can be performed marginally for each gene (differential gene expression) or for a collection of gene sets (gene set analysis). A preliminary (marginal) per-gene analysis of differential expression is usually performed to obtain a set of significant genes or marginal p-values, which are later used to study the association between phenotype and gene expression. This paper proposes using methods from spatial statistics to test gene set differential expression with paired samples of RNA-Seq counts. The approach does not rely on a previous per-gene differential expression analysis. Instead, we compare the paired counts within each sample/control pair using a binomial test. Each pair yields one p-value per gene, so the gene expression profile is transformed into a vector of p-values, which is treated as an event of a point pattern. This is the first component of a bivariate point pattern. The second component is generated by applying two different randomization distributions to the correspondence between samples and treatment. The self-contained null hypothesis considered in gene set analysis can then be formulated as random labeling of this bivariate point pattern. The gene sets were defined by Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The proposed methodology was tested on four RNA-Seq datasets of colorectal cancer (CRC) patients, and the results were compared with those obtained using the edgeR-GOseq pipeline. The proposed methodology proved consistent at the biological and statistical level, in particular when using the Cuzick and Edwards test with one realization of the second component and the between-pair distribution.
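The two building blocks named in the abstract, a per-pair binomial test and a Cuzick and Edwards test under random labeling, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a two-sided exact test with success probability 0.5 (no library-size adjustment), Euclidean distance between p-value vectors, and a Monte Carlo permutation of the labels. All function names are hypothetical, and the abstract does not specify how the second component of the point pattern is built, so the labels are simply taken as input.

```python
import math
import random


def binomial_pvalue(x, y):
    """Two-sided exact binomial p-value for one sample/control pair of counts.

    Assumption: under the null, x ~ Binomial(x + y, 0.5).
    """
    n = x + y
    if n == 0:
        return 1.0
    k = min(x, y)
    # Lower tail P(X <= k); at p = 0.5 the upper tail is its mirror image.
    tail = sum(math.comb(n, i) for i in range(k + 1))
    return min(1.0, 2 * tail / 2 ** n)


def p_value_vector(sample_counts, control_counts):
    """Turn one gene's paired counts into a vector of per-pair p-values."""
    return [binomial_pvalue(x, y) for x, y in zip(sample_counts, control_counts)]


def cuzick_edwards(points, labels, k=1):
    """T_k: over all points labelled 1, count their k nearest neighbours labelled 1."""
    total = 0
    for i, p in enumerate(points):
        if labels[i] != 1:
            continue
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        total += sum(labels[j] for j in others[:k])
    return total


def random_labelling_pvalue(points, labels, k=1, n_perm=999, seed=0):
    """Monte Carlo p-value of T_k when labels are permuted at random."""
    rng = random.Random(seed)
    observed = cuzick_edwards(points, labels, k)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if cuzick_edwards(points, shuffled, k) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical counts for one gene across three sample/control pairs.
    print(p_value_vector([0, 5, 40], [10, 5, 12]))
    # Hypothetical labelled points: 1 = first component, 0 = second component.
    pts = [(0.0, 0.0), (0.0, 0.1), (1.0, 1.0), (1.0, 1.1)]
    print(random_labelling_pvalue(pts, [1, 1, 0, 0], k=1))
```

A large value of T_k relative to its permutation distribution suggests that events of the first component cluster together more than random labeling would produce, which is the kind of departure the self-contained null hypothesis is meant to detect.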

