A distribution-free valid p-value for finite samples of bounded random variables

Joaquin Alvarez

arXiv:2405.08975·stat.ML·May 16, 2024

Open Access

TL;DR

This paper builds a distribution-free p-value that is valid for finite samples of bounded random variables, using a concentration inequality due to Pelekis, Ramon and Wang. Motivated by the calibration of predictive algorithms, the p-value is tighter than the Hoeffding and Bentkus alternatives in certain regions, and the ideas also apply to classical statistical inference.

Contribution

It presents a novel super-uniform p-value based on a concentration inequality, offering tighter bounds than Hoeffding and Bentkus in certain regions, applicable to both machine learning and classical statistics.

Findings

01. The p-value is valid for finite samples of bounded variables.

02. It is tighter than Hoeffding and Bentkus bounds in specific regions.

03. The method enhances calibration and inference in distribution-free settings.

Abstract

We build a valid p-value based on a concentration inequality for bounded random variables introduced by Pelekis, Ramon and Wang. The motivation behind this work is the calibration of predictive algorithms in a distribution-free setting. The super-uniform p-value is tighter than Hoeffding and Bentkus alternatives in certain regions. Even though we are motivated by a calibration setting in a machine learning context, the ideas presented in this work are also relevant in classical statistical inference. Furthermore, we compare the power of a collection of valid p-values for bounded losses, which are presented in previous literature.
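For orientation, the two baselines named in the abstract can be sketched in a few lines. This is a minimal illustration of the standard Hoeffding and Bentkus p-values for bounded losses, not the paper's new p-value (its formula is not given on this page). It assumes losses are scaled to [0, 1] and that the null hypothesis is "the expected loss is at least `alpha`"; the function names are hypothetical.

```python
import math


def hoeffding_p_value(losses, alpha):
    """Hoeffding p-value for H0: E[loss] >= alpha (assumes losses in [0, 1])."""
    n = len(losses)
    mean = sum(losses) / n
    # Only an empirical mean below alpha counts as evidence against H0.
    gap = max(alpha - mean, 0.0)
    return math.exp(-2.0 * n * gap ** 2)


def bentkus_p_value(losses, alpha):
    """Bentkus p-value: e * P(Binomial(n, alpha) <= ceil(sum of losses)), capped at 1."""
    n = len(losses)
    k = min(n, math.ceil(sum(losses)))
    # Binomial CDF computed directly from the probability mass function.
    cdf = sum(
        math.comb(n, j) * alpha ** j * (1.0 - alpha) ** (n - j)
        for j in range(k + 1)
    )
    return min(1.0, math.e * cdf)


if __name__ == "__main__":
    sample = [0.0] * 100  # hypothetical calibration losses, all zero
    print(hoeffding_p_value(sample, 0.1))
    print(bentkus_p_value(sample, 0.1))
```

Both functions return a value that is super-uniform under the assumed null, meaning the probability of the p-value falling below any level u is at most u. Neither dominates the other everywhere, which is why comparisons of this kind are stated "in certain regions".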

Taxonomy

Topics: Probability and Risk Models