Active, anytime-valid risk controlling prediction sets

Ziyu Xu; Nikos Karampatziakis; Paul Mineiro

arXiv:2406.10490 · stat.ML · November 1, 2024

TL;DR

This paper extends risk controlling prediction sets to an anytime-valid, sequential setting, enabling adaptive data collection with guarantees, and introduces active labeling strategies that optimize label usage and utility.

Contribution

It introduces a framework for risk controlling prediction sets in sequential, adaptive data collection, with active labeling and utility optimization, providing theoretical guarantees and practical algorithms.
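The difference between a fixed-sample guarantee and an anytime-valid one can be written out explicitly. The notation below is an assumption made for illustration and is not taken from this page: $C_\beta$ is a prediction set indexed by a parameter $\beta$, $\ell$ is a bounded loss, $\theta$ is the target risk level, $\alpha$ is the allowed error probability, and $\hat\beta_t$ is the parameter chosen after $t$ data points.

```latex
% Risk of the prediction set indexed by \beta (assumed notation)
R(\beta) = \mathbb{E}\big[\ell\big(Y, C_\beta(X)\big)\big]

% Fixed-sample RCPS guarantee: holds for one calibrated \hat\beta
\mathbb{P}\big(R(\hat\beta) \le \theta\big) \ge 1 - \alpha

% Anytime-valid guarantee: holds simultaneously at all time steps
\mathbb{P}\big(\forall t \ge 1 :\ R(\hat\beta_t) \le \theta\big) \ge 1 - \alpha
```

The third line is the stronger statement: the probability is taken over the whole sequence, so the user may stop or keep collecting data at any time without invalidating the guarantee.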

Findings

01. Guarantees hold at all time steps in sequential data collection.

02. Active labeling policies reduce label usage while maintaining utility.

03. Empirical results show improved label efficiency over baselines.
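To make the active labeling finding concrete, here is a minimal sketch of one standard way to keep a loss estimate unbiased when only some labels are queried: inverse propensity weighting. It is an illustration under stated assumptions, not necessarily the exact estimator in the paper or in the linked repository. The function name `ipw_loss` and its parameters are hypothetical.

```python
import random


def ipw_loss(loss_value: float, query_prob: float, rng: random.Random) -> float:
    """Return an importance-weighted loss estimate for one data point.

    Hypothetical helper: the label is queried with probability `query_prob`.
    If it is queried, the observed loss is divided by `query_prob`;
    otherwise the estimate is 0. In expectation this equals `loss_value`:
        query_prob * (loss_value / query_prob) + (1 - query_prob) * 0.
    """
    if not 0.0 < query_prob <= 1.0:
        raise ValueError("query_prob must lie in (0, 1]")
    queried = rng.random() < query_prob  # Bernoulli(query_prob) labeling decision
    return loss_value / query_prob if queried else 0.0


# Usage: average many estimates for a point whose true loss is 0.3
# while querying its label only a quarter of the time.
rng = random.Random(0)
estimates = [ipw_loss(0.3, 0.25, rng) for _ in range(20000)]
mean_estimate = sum(estimates) / len(estimates)
```

The trade-off is variance: a smaller `query_prob` saves labels but makes each weighted estimate larger when a label is queried, which is why the choice of labeling policy matters for label efficiency.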

Abstract

Rigorously establishing the safety of black-box machine learning models concerning critical risk measures is important for providing guarantees about model behavior. Recently, Bates et al. (JACM '24) introduced the notion of a risk controlling prediction set (RCPS) for producing prediction sets from machine learning models that are statistically guaranteed to have low risk. Our method extends this notion to the sequential setting, where we provide guarantees even when the data is collected adaptively, and ensures that the risk guarantee is anytime-valid, i.e., simultaneously holds at all time steps. Further, we propose a framework for constructing RCPSes for active labeling, i.e., allowing one to use a labeling policy that chooses whether to query the true label for each received data point and ensures that the expected proportion of data points whose labels are queried is below a…
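The anytime-valid idea in the abstract can be illustrated with a betting-style wealth process. The sketch below is a simplified illustration under assumptions of my own (i.i.d. data, every label observed, a fixed bet size, a finite grid of candidate parameters, and a loss in [0, 1] that does not increase as `beta` grows); it is not the authors' implementation. The names `betting_rcps`, `miss`, and all parameters are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def betting_rcps(
    stream: Sequence[Tuple[object, float]],
    loss: Callable[[object, float, float], float],
    betas: Sequence[float],
    theta: float,
    alpha: float,
    bet: float = 0.5,
) -> List[Optional[float]]:
    """Return, for each time step, the smallest certified beta (or None).

    Assumptions (not from the source page): loss(x, y, beta) lies in [0, 1]
    and is nonincreasing in beta; data points are i.i.d.; the same fixed
    bet is used for every beta so wealth is monotone in beta.
    """
    if not (0.0 < theta < 1.0 and 0.0 < alpha < 1.0):
        raise ValueError("theta and alpha must lie in (0, 1)")
    if not 0.0 <= bet < 1.0 / (1.0 - theta):
        raise ValueError("bet too large: wealth could become negative")

    wealth = [1.0] * len(betas)       # one wealth process per candidate beta
    certified = [False] * len(betas)  # stays True once wealth has crossed 1/alpha
    chosen: List[Optional[float]] = []

    for x, y in stream:
        for j, beta in enumerate(betas):
            # Wealth grows when the observed loss is below the target theta.
            wealth[j] *= 1.0 + bet * (theta - loss(x, y, beta))
            # If the true risk exceeds theta, the wealth is a nonnegative
            # supermartingale, so by Ville's inequality it reaches 1/alpha
            # at any time with probability at most alpha.
            if wealth[j] >= 1.0 / alpha:
                certified[j] = True
        safe = [b for b, ok in zip(betas, certified) if ok]
        chosen.append(min(safe) if safe else None)
    return chosen


# Usage: a toy miscoverage loss for the set [0, beta].
def miss(x: object, y: float, beta: float) -> float:
    return 1.0 if y > beta else 0.0


stream = [(None, 0.2)] * 50
chosen = betting_rcps(stream, miss, betas=[0.1, 0.5], theta=0.1, alpha=0.1)
```

Because certification is tracked across all past time steps, the reported parameter never gets worse as more data arrives, and the output can be inspected after every data point rather than only at a pre-specified sample size.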

Code & Models

Repositories

neilzxu/active-rcps · PyTorch · Official

Videos

Active, anytime-valid risk controlling prediction sets · SlidesLive

Taxonomy

Topics: Fault Detection and Control Systems · Machine Learning and Data Classification · Advanced Control Systems Optimization

Methods: Sparse Evolutionary Training