
tableone: An open source Python package for producing summary statistics for research papers

Abstract

Objectives
In quantitative research, understanding basic parameters of the study population is key for interpretation of the results. As a result, it is typical for the first table (“Table 1”) of a research paper to include summary statistics for the study data. Our objectives are 2-fold. First, we seek to provide a simple, reproducible method for providing summary statistics for research papers in the Python programming language. Second, we seek to use the package to improve the quality of summary statistics reported in research papers.

Materials and Methods
The tableone package is developed following good practice guidelines for scientific computing and all code is made available under a permissive MIT License. A testing framework runs on a continuous integration server, helping to maintain code stability. Issues are tracked openly and public contributions are encouraged.

Results
The tableone software package automatically compiles summary statistics into publishable formats such as CSV, HTML, and LaTeX. An executable Jupyter Notebook demonstrates application of the package to a subset of data from the MIMIC-III database. Tests such as Tukey’s rule for outlier detection and Hartigan’s Dip Test for modality are computed to highlight potential issues in summarizing the data.

Discussion and Conclusion
We present open source software for researchers to facilitate carrying out reproducible studies in Python, an increasingly popular language in scientific research. The toolkit is intended to mature over time with community feedback and input. Development of a common tool for summarizing data may help to promote good practice when used as a supplement to existing guidelines and recommendations. We encourage use of tableone alongside other methods of descriptive statistics and, in particular, visualization to ensure appropriate data handling. We also suggest seeking guidance from a statistician when using tableone for a research study, especially prior to submitting the study for publication.
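As an illustration of the kind of check the abstract mentions, the following is a minimal sketch of Tukey’s rule for outlier detection, written with the Python standard library only. It is not taken from the tableone source code: the function name `tukey_outliers`, the parameter `k`, the quartile method, and the sample `ages` values are all assumptions made for this example. The rule flags any value that lies more than `k` times the interquartile range (IQR) below the first quartile or above the third quartile, with `k = 1.5` as the conventional choice.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule).

    Hypothetical helper for illustration; not part of the tableone API.
    """
    # Quartiles via linear interpolation over the observed range.
    # Other quartile conventions can give slightly different fences.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Preserve the input order of the flagged values.
    return [v for v in values if v < lower or v > upper]


# Made-up ages with one implausible entry.
ages = [54, 61, 58, 63, 59, 57, 60, 62, 56, 140]
print(tukey_outliers(ages))  # → [140]
```

A flagged value is a prompt to inspect the data, not proof of an error; whether to report the mean or the median for such a variable remains a judgment call, ideally made with a statistician.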

Oxford University Press (OUP)

Tom J Pollard, Alistair E W Johnson, Jesse D Raffa, Roger G Mark

JAMIA Open

2018

