Universal Inference for Testing Calibration of Mean Estimates within the Exponential Dispersion Family

Łukasz Delong; Mario Wüthrich

arXiv:2510.23821·stat.AP·October 29, 2025

TL;DR

This paper introduces a universal inference method for testing the calibration of mean estimates within the exponential dispersion family, offering finite sample guarantees and improved power over classical tests.

Contribution

It develops a sub-sampled split likelihood ratio test with universal validity and proposes a new test statistic to enhance calibration testing performance.

Findings

1. The proposed test achieves high power in detecting miscalibration.

2. It provides finite sample guarantees with universally valid critical values.

3. The method outperforms classical likelihood ratio tests in numerical experiments.

Abstract

Calibration of mean estimates for predictions is a crucial property in many applications, particularly in financial and actuarial decision-making. In this paper, we first review classical approaches for validating mean-calibration, and we discuss the Likelihood Ratio Test (LRT) within the Exponential Dispersion Family (EDF). Then, we investigate the framework of universal inference to test for mean-calibration. We develop a sub-sampled split LRT within the EDF that provides finite sample guarantees with universally valid critical values. We investigate the type I error, power and e-power of this sub-sampled split LRT, compare it to the classical LRT, and propose a novel test statistic based on the sub-sampled split LRT to enhance the performance of the calibration test. A numerical analysis verifies that our proposal is an attractive alternative to the classical LRT…
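To make the idea of a sub-sampled split LRT concrete, here is a minimal sketch that follows the generic universal inference recipe. It is not the authors' implementation. It assumes a Poisson response (one member of the EDF), a null hypothesis that the given mean estimates `mu` are the true means, and a deliberately simple alternative with one global recalibration factor `c`. The function name `split_lrt_poisson` and all parameters are hypothetical. On each random split, `c` is estimated on one half and the likelihood ratio is evaluated on the other half. Under the null, each such ratio has expectation at most 1, so their average does too, and Markov's inequality gives a type I error of at most alpha when rejecting at the threshold 1/alpha.

```python
import numpy as np


def split_lrt_poisson(y, mu, n_splits=50, seed=0):
    """Sub-sampled split LRT statistic for H0: y_i ~ Poisson(mu_i).

    Assumed alternative (illustration only): y_i ~ Poisson(c * mu_i)
    with a single global factor c estimated on the held-out half.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = np.empty(n_splits)
    for b in range(n_splits):
        perm = rng.permutation(n)
        d1, d0 = perm[: n // 2], perm[n // 2:]
        # Estimate the recalibration factor on D1 only (clipped away from 0).
        c_hat = max(y[d1].sum() / mu[d1].sum(), 1e-12)
        # Poisson log-likelihood ratio on D0, alternative vs. null (c = 1).
        # The log(y!) terms cancel between numerator and denominator.
        log_t = np.sum(y[d0] * np.log(c_hat) - (c_hat - 1.0) * mu[d0])
        stats[b] = np.exp(log_t)
    # Averaging over splits keeps the expectation at most 1 under H0.
    return stats.mean()


# Simulated data: calibrated means vs. means that are too low by a factor 2.
rng = np.random.default_rng(1)
mu = rng.uniform(0.5, 1.5, size=400)
alpha = 0.05
t_ok = split_lrt_poisson(rng.poisson(mu), mu)
t_bad = split_lrt_poisson(rng.poisson(2.0 * mu), mu)
print("reject (calibrated):   ", t_ok >= 1.0 / alpha)
print("reject (miscalibrated):", t_bad >= 1.0 / alpha)
```

The critical value 1/alpha does not depend on the sample size or on asymptotic approximations, which is what "universally valid" refers to. A richer alternative, such as a regression of `y` on `mu` fitted on the held-out half, could be swapped in for the global factor without changing the validity argument.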
