Influence of multiple hypothesis testing on reproducibility in neuroimaging research
Abstract
Background
Reproducibility of research findings has recently been questioned in many fields of science, including psychology and the neurosciences. One factor influencing reproducibility is the simultaneous testing of multiple hypotheses, which increases the number of false positive findings unless the p-values are carefully corrected. While this multiple testing problem is well known and has been studied for decades, it continues to be both a theoretical and practical problem.
New Method
Here we assess the reproducibility of research involving multiple testing corrected for family-wise error rate (FWER) or false discovery rate (FDR) by techniques based on random field theory (RFT), cluster-mass based permutation testing, adaptive FDR, and several classical methods. We also investigate the performance of these methods under two different models.
Results
We found that permutation testing is the most powerful method among the considered approaches to multiple testing, and that grouping hypotheses based on prior knowledge can improve power. We also found that emphasizing primary and follow-up studies equally produced the most reproducible outcomes.
Comparison with Existing Method(s)
We have extended the use of two-group and separate-classes models for analyzing reproducibility and provide new open-source software, “MultiPy”, for multiple hypothesis testing.
Conclusions
Our results suggest that performing strict corrections for multiple testing is not sufficient to improve reproducibility of neuroimaging experiments. The methods are freely available as a Python toolkit, “MultiPy”, and we intend this study to help improve statistical data analysis practices and to assist in conducting power and reproducibility analyses for new experiments.
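To make the FWER versus FDR distinction in the abstract concrete, the following is a minimal sketch of two widely used classical corrections: Bonferroni (FWER control) and the Benjamini-Hochberg step-up procedure (FDR control under independence). It is an illustration only. It does not use or reproduce the MultiPy API, the function names and the p-values are hypothetical, and the abstract does not state which classical methods the study actually compared.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha / m, which controls the FWER."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls FDR under independence)."""
    m = len(pvals)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Made-up p-values for ten simultaneous tests (illustrative only).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

print("uncorrected:", sum(p <= 0.05 for p in pvals))
print("Bonferroni: ", sum(bonferroni(pvals)))
print("BH (FDR):   ", sum(benjamini_hochberg(pvals)))
```

With these made-up values, five tests fall below the uncorrected 0.05 threshold, Bonferroni retains one, and Benjamini-Hochberg retains two, which shows the usual trade-off between strictness and power.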
Related Results
An Event Based Topic Learning Pipeline for Neuroimaging Literature Mining
Abstract
Neuroimaging text mining extracts knowledge from neuroimaging text and has received widespread attention. Topic learning is an important research focus of neuroima...
Epidemiological characteristics and prevalence rates of research reproducibility across disciplines: A scoping review of articles published in 2018-2019
Background: Reproducibility is a central tenet of research. We aimed to synthesize the literature on reproducibility and describe its epidemiological characteristics, including how...
The charm of structural neuroimaging in insanity evaluations: guidelines to avoid misinterpretation of the findings
Abstract
Despite the popularity of structural neuroimaging techniques in twenty-first-century research, its results have had limited translational impact in real-world settings, whe...
Assessment of transparent and reproducible research practices in the psychiatry literature
Background
Reproducibility is a cornerstone of scientific advancement; however, many published works may lack the core components needed for study reproducibility...
Blunt Chest Trauma and Chylothorax: A Systematic Review
Abstract
Introduction: Although traumatic chylothorax is predominantly associated with penetrating injuries, instances following blunt trauma, as a rare and challenging condition, ...
A training program for researchers in population neuroimaging: Early experiences
Recent advances in neuroimaging create groundbreaking opportunities to better understand human neurological and psychiatric diseases, but also bring new challenges. With the advent...
A validation framework for neuroimaging software: the case of population receptive fields
Abstract
Neuroimaging software methods are complex, making it a near certainty that some implementations will contain errors. Modern computational techniques (i.e., public code and ...
Do psychology students interpret null hypothesis significance testing critically?
The goal of the study was to descriptively analyze the understanding of null hypothesis significance testing among Croatian psychology students considering how it is usually unders...

