
Test-statistic correlation and data-row correlation

Abstract: When a statistical test is applied repeatedly to the rows of a data matrix, as in differential-expression analysis of gene expression data, correlations among data rows give rise to correlations among the corresponding test statistic values. Correlations among test statistic values create many inferential challenges in false-discovery-rate control procedures, gene-set enrichment analysis, and other procedures that aim to summarize the collection of test results. To tackle these challenges, researchers sometimes use, explicitly or implicitly, the correlations among the data rows (e.g., as measured by Pearson correlation coefficients) to approximate the correlations among the corresponding test statistic values. We show, however, that such approximations are valid only under limited settings. We investigate the relationship between the correlation coefficient between a pair of test statistics (test-statistic correlation) and the correlation coefficient between the two corresponding data rows (data-row correlation). We derive an analytical formula for the test-statistic correlation as a function of the data-row correlation for a general class of test statistics, of which the two-sample t-test is a special case. The analytical formula implies that the test-statistic correlation is generally weaker than the corresponding data-row correlation and that, in general, the latter will not approximate the former well when the null hypotheses involved are false. We verify our analytical results through simulations.
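The kind of simulation check mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code or their analytical formula: it assumes two data rows drawn from a bivariate normal distribution with unit variances and correlation `rho`, two groups of sizes `n1` and `n2`, an equal-variance (pooled) two-sample t statistic, and the same mean shift `shift` applied to the second group in both rows. All function and parameter names are hypothetical.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def sample_var(xs):
    # Unbiased sample variance (divides by n - 1).
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pooled_t(a, b):
    # Equal-variance two-sample t statistic comparing group b with group a.
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * sample_var(a) + (n2 - 1) * sample_var(b)) / (n1 + n2 - 2)
    return (mean(b) - mean(a)) / math.sqrt(pooled * (1 / n1 + 1 / n2))


def pearson(u, v):
    # Pearson correlation coefficient between two equal-length sequences.
    mu, mv = mean(u), mean(v)
    suv = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    suu = sum((x - mu) ** 2 for x in u)
    svv = sum((y - mv) ** 2 for y in v)
    return suv / math.sqrt(suu * svv)


def t_stat_correlation(rho, shift, n1=10, n2=10, reps=5000, seed=0):
    """Estimate the correlation between the t statistics of two data rows.

    Assumption (illustrative only): each sample contributes one pair of
    values, one per row, drawn from a bivariate normal with correlation
    rho; group 2 has its mean shifted by `shift` in both rows.
    """
    rng = random.Random(seed)
    t_row1, t_row2 = [], []
    for _ in range(reps):
        row1, row2 = [], []
        for _ in range(n1 + n2):
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            row1.append(z1)
            row2.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        # The first n1 samples form group 1; the rest form group 2.
        t_row1.append(pooled_t(row1[:n1], [x + shift for x in row1[n1:]]))
        t_row2.append(pooled_t(row2[:n1], [x + shift for x in row2[n1:]]))
    return pearson(t_row1, t_row2)


if __name__ == "__main__":
    rho = 0.8
    print("data-row correlation:", rho)
    print("t-stat correlation, nulls true :", t_stat_correlation(rho, shift=0.0))
    print("t-stat correlation, nulls false:", t_stat_correlation(rho, shift=3.0))
```

Comparing the two estimates with `rho` gives a quick, informal way to see how far the test-statistic correlation departs from the data-row correlation when the nulls are true versus when they are false. Other settings, such as unequal shifts or different test statistics, would need their own checks.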

Related Results

Row Orientation and Planting Pattern of Relay Intercropped Soybean and Wheat
Relay intercropping soybean [Glycine max (L.) Merr.] into winter wheat (Triticum aestivum L.) may increase soybean yields compared with doublecropping. Once the soybean crop is esta...
Evaluating Intercropping Limitations of Cowpea (Vigna unguiculata L.), Pearl Millet (Pennisetum glaucum L.), and Maize (Zea mays L.)
Fodder scarcity is a major obstacle to boosting the livestock sector. The hypothesis was that fodder yield per unit of land could be increased by intercropping of cowpeas, pearl millet ...
The Robustness of the Modified H-Statistic in the Test of Comparing Independent Groups
The H-statistic is a robust test statistic for comparing the equality of two or more independent groups. This statistic is a good alternative to the F-statistic in ...
Provocative Tests in Diagnosis of Thoracic Outlet Syndrome: A Narrative Review
Thoracic outlet syndrome (TOS) is a group of conditions caused by the compression of the neurovascular bundle within the thoracic outlet. It is classified into three main ...
Intercropping of Cabbage with Maize
The experiment was carried out at the research field of Agricultural Research Station, Rajbari, Dinajpur (Latitude: 25.63544, Longitude: 88.65144) during rabi season of 2016-2017 a...
Stomatal Response to High Evaporative Demand in Irrigated Grain Sorghum in Narrow and Wide Row Spacing
Stomatal activity of leaves can be related to factors under producer control, including row spacing and orientation. In both grain sorghum [Sorghum bicolor (L.) Moench] and...
Effect of Sorghum-Mung Bean Intercropping on Sorghum-Based Cropping System in the Lowlands of North Shewa, Ethiopia
Due to decreasing land units and a decline in soil fertility, integrating mung beans into the Sorghum production system is a viable option for increasing productivity and producing...
