Performance metrics for models designed to predict treatment effect

ABSTRACT

Background
Measuring the performance of models that predict individualized treatment effect is challenging because the outcomes of two alternative treatments are inherently unobservable in one patient. The C-for-benefit was proposed to measure discriminative ability. However, measures of calibration and overall performance are still lacking. We aimed to propose metrics of calibration and overall performance for models predicting treatment effect.

Methods
Similar to the previously proposed C-for-benefit, we defined observed pairwise treatment effect as the difference between outcomes in pairs of matched patients with different treatment assignment. We redefined the E-statistics, the cross-entropy, and the Brier score into metrics for measuring a model’s ability to predict treatment effect. In a simulation study, the metric values of deliberately “perturbed models” were compared to those of the data-generating model, i.e., “optimal model”. To illustrate these performance metrics, different modeling approaches for predicting treatment effect are applied to the data of the Diabetes Prevention Program: 1) a risk modelling approach with restricted cubic splines; 2) an effect modelling approach including penalized treatment interactions; and 3) the causal forest.

Results
As desired, performance metric values of “perturbed models” were consistently worse than those of the “optimal model” (Eavg-for-benefit ≥ 0.070 versus 0.001, E90-for-benefit ≥ 0.115 versus 0.003, cross-entropy-for-benefit ≥ 0.757 versus 0.733, Brier-for-benefit ≥ 0.215 versus 0.212). Calibration, discriminative ability, and overall performance of three different models were similar in the case study. The proposed metrics were implemented in a publicly available R-package “HTEPredictionMetrics”.

Conclusion
The proposed metrics are useful to assess the calibration and overall performance of models predicting treatment effect.
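The pairwise idea in the Methods can be illustrated with a minimal Python sketch. It is not the code of the “HTEPredictionMetrics” R-package, and the abstract does not give the exact formulas. The sketch assumes binary outcomes where 1 is the unfavourable event, pairs that are already matched, and a Brier-style score taken as the mean squared difference between predicted and observed pairwise treatment effect. The function names, the sign convention, and the example data are hypothetical.

```python
from typing import List, Tuple

# One matched pair: (outcome of control patient, outcome of treated patient,
# predicted treatment effect for the pair). Layout is an assumption of this sketch.
Pair = Tuple[int, int, float]


def observed_pairwise_effect(y_control: int, y_treated: int) -> int:
    """Difference between the outcomes of two matched patients.

    Assumed convention: 1 = benefit, 0 = no effect, -1 = harm,
    with outcome 1 denoting the unfavourable event.
    """
    return y_control - y_treated


def brier_for_benefit(pairs: List[Pair]) -> float:
    """Brier-style score: mean squared difference between predicted
    and observed pairwise treatment effect (lower is better)."""
    if not pairs:
        raise ValueError("at least one matched pair is required")
    squared_errors = [
        (predicted - observed_pairwise_effect(y_control, y_treated)) ** 2
        for y_control, y_treated, predicted in pairs
    ]
    return sum(squared_errors) / len(squared_errors)


# Made-up pairs, for illustration only.
example_pairs: List[Pair] = [
    (1, 0, 0.3),   # event avoided under treatment: observed benefit
    (0, 0, 0.1),   # no event in either patient: no observed effect
    (0, 1, -0.2),  # event only under treatment: observed harm
    (1, 1, 0.0),   # event in both patients: no observed effect
]

score = brier_for_benefit(example_pairs)
print(f"Brier-style score over {len(example_pairs)} pairs: {score:.3f}")
```

A model whose predicted pairwise effects lie closer to the observed ones receives a lower score, which matches the direction reported in the Results, where “perturbed models” scored worse (higher) than the “optimal model”.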

Cold Spring Harbor Laboratory

C.C.H.M. Maas D.M. Kent M.C. Hughes R. Dekker H.F. Lingsma D. van Klaveren

2022

Related Results

The Impact of IL28B Gene Polymorphisms on Drug Responses

To achieve high therapeutic efficacy in the patient, information on pharmacokinetics, pharmacodynamics, and pharmacogenetics is required. With the development of science and techno...

THE SECURITY AND PRIVACY MEASURING SYSTEM FOR THE INTERNET OF THINGS DEVICES

The purpose of the article is to close the gap in the existing need for a set of clear and objective security and privacy metrics for IoT device users and manufacturers and a...

Corporate environmental reporting: what's in a metric?

Although there has been increased attention to corporate environmental reports (CERs), there has yet to be a close examination of the metrics used in these reports. Metrics...

Neural Embedding-Based Metrics for Pre-retrieval Query Performance Prediction

Pre-retrieval Query Performance Prediction (QPP) methods are oblivious to the performance of the retrieval model as they predict query difficulty prior to observing the se...

Machine learning for aircraft trajectory prediction: a solution for pre-tactical air traffic flow management

The goal of air traffic flow and capacity management (ATFCM) is to ensure that airport and airspace capacity meet traffic demand while optimising traffic flows to avoid e...

Current therapeutic strategies for erectile function recovery after radical prostatectomy – literature review and meta-analysis

Radical prostatectomy is the most commonly performed treatment option for localised prostate cancer. In the last decades the surgical technique has been improved and modified in or...
