Choosing informative priors in Bayesian regression models. A simulation study and tutorial using Stan and R

Abstract

Background: Bayesian regression models provide a robust framework for complex data analysis, proving particularly advantageous in scenarios with small sample sizes common in medical research. However, specifying appropriate prior distributions, which incorporate existing knowledge to regularize model parameters, remains a challenge for many researchers. This can lead to unstable or implausible estimates. This study aims to demonstrate the impact of different prior distributions on regression models and provide a practical guide for choosing and justifying informative priors to produce more stable and credible results.

Methods: The study involved two parts. First, a simulation study was conducted to systematically assess the sensitivity of Bayesian linear regression models to prior specification. We systematically varied sample size, prior location, and prior scale to observe the impact on posterior estimates for a known true effect size. Second, a case-control study using real-world patient data (N = 526) demonstrated the practical application of choosing informative priors. Bayesian logistic regression models were used to analyse the relationship between severe dementia and fall incidence, comparing results from priors based on existing literature (“believer”), conservative priors (“agnostic”), and priors assuming an opposite effect (“sceptical”).

Results: The simulation study showed that strongly informative priors had a substantial influence on posterior estimates, particularly at smaller sample sizes. As the sample size increased, the influence of the data grew, and the estimates converged toward the true effect. In the case-control study, a standard frequentist analysis produced an odds ratio of 8.87 with a very wide and unstable confidence interval (1.66–165.19). In contrast, a Bayesian model using a moderately informative “believer” prior derived from existing research yielded a more plausible odds ratio of 4.40 with a substantially narrower and more precise credible interval (1.82–12.54).

Conclusions: The careful and transparent specification of informative priors is a critical tool in Bayesian analysis, especially when data are sparse. By incorporating justified, evidence-based assumptions, researchers can regularize models to prevent implausible outcomes and produce more stable, interpretable, and credible results. This approach enhances the robustness of statistical inference in fields where small sample sizes are a frequent challenge.
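The simulation design described in the abstract (varying sample size, prior location, and prior scale against a known true effect) can be illustrated with a minimal sketch. This is not the authors' code: the paper uses Stan and R, whereas the sketch below uses plain Python and swaps MCMC for the closed-form normal-normal conjugate update, which applies only under simplifying assumptions (a single slope, no intercept, known residual standard deviation). The function names `posterior_slope` and `simulate`, the true effect of 0.5, and the grid of prior settings are all hypothetical choices made for illustration.

```python
import random


def posterior_slope(x, y, prior_mean, prior_sd, sigma=1.0):
    """Closed-form posterior for beta in y = beta * x + eps, eps ~ N(0, sigma^2),
    with prior beta ~ N(prior_mean, prior_sd^2) and sigma assumed known.
    Returns (posterior mean, posterior sd)."""
    prior_prec = 1.0 / prior_sd ** 2                      # precision of the prior
    data_prec = sum(xi * xi for xi in x) / sigma ** 2     # precision contributed by the data
    post_prec = prior_prec + data_prec
    weighted_sum = prior_prec * prior_mean + sum(
        xi * yi for xi, yi in zip(x, y)
    ) / sigma ** 2
    return weighted_sum / post_prec, post_prec ** -0.5


def simulate(n, true_beta, sigma, rng):
    """Draw n observations from the assumed data-generating model."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [true_beta * xi + rng.gauss(0.0, sigma) for xi in x]
    return x, y


if __name__ == "__main__":
    rng = random.Random(2025)
    true_beta = 0.5  # hypothetical "known true effect size"
    for n in (10, 50, 500):                 # sample size
        x, y = simulate(n, true_beta, 1.0, rng)
        for prior_mean in (-0.5, 0.0, 0.5):  # prior location
            for prior_sd in (0.1, 1.0):      # prior scale
                mean, sd = posterior_slope(x, y, prior_mean, prior_sd)
                print(f"n={n:4d} prior=N({prior_mean:+.1f}, {prior_sd:.1f}) "
                      f"posterior mean={mean:+.3f} sd={sd:.3f}")
```

Because the posterior mean is a precision-weighted average of the prior mean and the least-squares estimate, a tight prior (small `prior_sd`) dominates when `n` is small, and the data term dominates as `n` grows. That is the qualitative pattern the abstract reports, though the sketch makes no claim about the paper's actual numbers.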

Springer Science and Business Media LLC

Daniel Lüdecke Anna Makowski Jens Klein Dominique Makowski

2025
