Statistical Inference for Responsiveness Verification

Seung Hyun Cheon, Meredith Stewart, Bogdan Kulynych, Tsui-Wei Weng, and Berk Ustun

arXiv:2507.02169·cs.LG·July 4, 2025


TL;DR

This paper presents a formal validation method for assessing how machine learning predictions respond to feature interventions, aiming to improve safety and reliability in high-stakes applications.

Contribution

It frames responsiveness verification as a sensitivity analysis that needs only black-box access to the model, so practitioners can evaluate how predictions respond to constrained interventions on features.

Findings

01 Develops algorithms that estimate the responsiveness of predictions using only black-box access

02 Uses these estimates to support falsification and failure probability estimation

03 Applied to real-world safety-critical scenarios

Abstract

Many safety failures in machine learning arise when models are used to assign predictions to people (often in settings like lending, hiring, or content moderation) without accounting for how individuals can change their inputs. In this work, we introduce a formal validation procedure for the responsiveness of predictions with respect to interventions on their features. Our procedure frames responsiveness as a type of sensitivity analysis in which practitioners control a set of changes by specifying constraints over interventions and distributions over downstream effects. We describe how to estimate responsiveness for the predictions of any model and any dataset using only black-box access, and how to use these estimates to support tasks such as falsification and failure probability estimation. We develop algorithms that construct these estimates by generating a uniform sample of…
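
The estimation idea in the abstract can be illustrated with a minimal Monte Carlo sketch. This is not the authors' algorithm: the abstract is truncated before it says what is sampled, so the sketch assumes the simplest possible setup, in which the allowed interventions form an axis-aligned box of additive feature changes and responsiveness is the fraction of uniformly sampled interventions that yield a target prediction. The names `estimate_responsiveness`, `toy_model`, `bounds`, and `target` are hypothetical, and the Hoeffding interval is a standard concentration bound swapped in for illustration.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def estimate_responsiveness(
    predict: Callable[[List[float]], int],
    x: Sequence[float],
    bounds: Sequence[Tuple[float, float]],
    target: int = 1,
    n: int = 2000,
    delta: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Estimate the fraction of allowed interventions on x that yield `target`.

    ASSUMPTION: interventions are additive changes drawn uniformly from a box,
    one (low, high) pair per feature; (0.0, 0.0) marks an immutable feature.
    The model is queried only through `predict` (black-box access).
    Returns (estimate, lower, upper) with a two-sided Hoeffding interval
    that holds with probability at least 1 - delta.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        # Apply one uniformly sampled intervention to every feature.
        x_new = [xi + rng.uniform(lo, hi) for xi, (lo, hi) in zip(x, bounds)]
        if predict(x_new) == target:
            hits += 1
    p_hat = hits / n
    # Hoeffding half-width: P(|p_hat - p| >= eps) <= 2 * exp(-2 * n * eps^2).
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    return p_hat, max(0.0, p_hat - eps), min(1.0, p_hat + eps)


def toy_model(x: List[float]) -> int:
    """Hypothetical classifier: approve (1) when the two features sum to >= 1."""
    return 1 if x[0] + x[1] >= 1.0 else 0


# Feature 0 can increase by up to 1.0; feature 1 is immutable.
# By construction, 40% of the reachable points are approved.
p_hat, lower, upper = estimate_responsiveness(
    toy_model, x=[0.2, 0.2], bounds=[(0.0, 1.0), (0.0, 0.0)]
)
print(f"estimate={p_hat:.3f}, interval=[{lower:.3f}, {upper:.3f}]")
```

In this toy setup, an estimate of zero with a small upper bound suggests the prediction is effectively fixed under the specified interventions, while a single sampled point that reaches the target falsifies the claim that no intervention can change it.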
