Non-parametric inference on calibration of predicted risks
Mohsen Sadatsafavi, John Petkau

TL;DR
This paper introduces a novel, statistically rigorous method for assessing moderate calibration of binary risk prediction models, avoiding arbitrary data grouping and providing joint inference on calibration metrics.
Contribution
The paper proposes a new Brownian motion-based inference method for moderate calibration, offering a unified test that improves power over existing techniques.
Findings
In simulation studies, the proposed Brownian bridge-based test (the "bridge test") has higher power than existing methods.
The method allows joint inference on mean and moderate calibration.
An R package implementation is provided for practical use.
Abstract
Moderate calibration, the expected event probability among observations with predicted probability z being equal to z, is a desired property of risk prediction models. Current graphical and numerical techniques for evaluating moderate calibration of risk prediction models are mostly based on smoothing or grouping the data. Moreover, there is no widely accepted inferential method for the null hypothesis that a model is moderately calibrated. In this work, we discuss recently developed methods, and propose novel ones, for the assessment of moderate calibration for binary responses. The methods are based on the limiting distributions of functions of standardized partial sums of prediction errors converging to the corresponding laws of Brownian motion. The novel method relies on well-known properties of the Brownian bridge, which enable joint inference on mean and moderate calibration, leading to…
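To make the partial-sum idea in the abstract concrete, here is a minimal Python sketch. It is not the authors' R implementation. Several details are assumptions of this sketch rather than statements from the text above: observations are ordered by predicted risk, "time" is the cumulative share of the Bernoulli variance p(1 - p), the two components are combined with Fisher's method, and the names `kolmogorov_sf` and `bridge_test` are hypothetical.

```python
import math
import random


def kolmogorov_sf(x, terms=100):
    """P(sup |B(t)| > x) for a standard Brownian bridge B on [0, 1]."""
    if x < 0.2:
        # The true value differs from 1 by less than 1e-11 here, and the
        # alternating series below converges slowly for small x.
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def bridge_test(y, p):
    """Sketch of joint inference on mean and moderate calibration.

    y: binary outcomes (0/1); p: predicted risks strictly inside (0, 1).
    """
    pairs = sorted(zip(p, y))  # assumption: order by predicted risk
    total_var = sum(pi * (1.0 - pi) for pi, _ in pairs)
    scale = math.sqrt(total_var)

    cum_err, cum_var = 0.0, 0.0
    times, path = [], []
    for pi, yi in pairs:
        cum_err += yi - pi             # partial sum of prediction errors
        cum_var += pi * (1.0 - pi)     # variance accumulated so far
        times.append(cum_var / total_var)
        path.append(cum_err / scale)   # standardized partial-sum process

    # Endpoint: approximately N(0, 1) under the null (mean calibration).
    w1 = path[-1]
    p_mean = math.erfc(abs(w1) / math.sqrt(2.0))

    # Bridge component W(t) - t * W(1), independent of W(1) in the limit.
    bridge_sup = max(abs(w - t * w1) for t, w in zip(times, path))
    p_bridge = kolmogorov_sf(bridge_sup)

    # Fisher's method for two independent p-values (chi-square, 4 df),
    # which has the closed form q * (1 - ln q) with q = p_mean * p_bridge.
    q = p_mean * p_bridge
    p_joint = q * (1.0 - math.log(q)) if q > 0.0 else 0.0
    return {"mean_stat": w1, "bridge_stat": bridge_sup,
            "p_mean": p_mean, "p_bridge": p_bridge, "p_joint": p_joint}


if __name__ == "__main__":
    random.seed(1)
    risks = [random.uniform(0.05, 0.95) for _ in range(2000)]
    # Outcomes drawn from the predicted risks, so the model is calibrated.
    outcomes = [1 if random.random() < r else 0 for r in risks]
    print(bridge_test(outcomes, risks))
```

No grouping or smoothing parameter appears anywhere in the sketch: the only inputs are the outcomes and the predicted risks, which is the property the abstract emphasizes.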
Taxonomy
Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Forecasting Techniques and Applications
