Blind-Spot Mass: A Good-Turing Framework for Quantifying Deployment Coverage Risk in Machine Learning Systems

Biplab Pal; Santanu Bhattacharya; Madanjit Singh

arXiv:2604.05057 · cs.LG · April 8, 2026

TL;DR

This paper introduces blind-spot mass, a Good-Turing-based framework for quantifying deployment coverage risk in machine learning. It targets valid but rare states that occur in operational environments yet are under-supported in finite training and evaluation data.

Contribution

It proposes a metric for estimating the probability mass of under-supported states and demonstrates its applicability across domains such as human activity recognition and clinical data.

Findings

01

Blind-spot mass converges to 95% at tau=5 across domains.

02

The framework effectively identifies dominant risk activities or regimes.

03

It provides actionable insights for targeted data collection and model safety.

Abstract

Blind-spot mass is a Good-Turing framework for quantifying deployment coverage risk in machine learning. In modern ML systems, operational state distributions are often heavy-tailed, implying that a long tail of valid but rare states is structurally under-supported in finite training and evaluation data. This creates a form of 'coverage blindness': models can appear accurate on standard test sets yet remain unreliable across large regions of the deployment state space. We propose blind-spot mass B_n(tau), a deployment metric estimating the total probability mass assigned to states whose empirical support falls below a threshold tau. B_n(tau) is computed using Good-Turing unseen-species estimation and yields a principled estimate of how much of the operational distribution lies in reliability-critical, under-supported regimes. We further derive a coverage-imposed accuracy ceiling,…
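As a rough illustration of the idea in the abstract, the sketch below estimates the mass of under-supported states from a list of observed states. It is not the authors' implementation. It assumes the classical Good-Turing relation, in which the total probability of states observed exactly r times is approximated by (r + 1) N_{r+1} / n, where N_r is the number of distinct states seen r times and n is the sample size. It also assumes that "below a threshold tau" means a count strictly less than tau. The function name `blind_spot_mass` and its parameters are hypothetical, and the paper's estimator may add smoothing or corrections not shown here.

```python
from collections import Counter


def blind_spot_mass(states, tau):
    """Unsmoothed Good-Turing style estimate of the probability mass of
    states whose empirical count is strictly below tau (assumed reading).

    states: iterable of hashable state identifiers (hypothetical input format)
    tau:    positive integer support threshold
    """
    states = list(states)
    n = len(states)
    if n == 0:
        raise ValueError("need at least one observation")
    if tau < 1:
        raise ValueError("tau must be a positive integer")

    counts = Counter(states)                  # state -> number of observations
    freq_of_freq = Counter(counts.values())   # r -> N_r, distinct states seen r times

    # Good-Turing: mass of states with count r is roughly (r + 1) * N_{r+1} / n.
    # Summing r = 0 .. tau - 1 covers unseen states (r = 0) and every state
    # seen fewer than tau times.
    total = sum((r + 1) * freq_of_freq.get(r + 1, 0) for r in range(tau))
    return total / n


if __name__ == "__main__":
    # Toy sample: one common state and three rare ones.
    sample = ["walk"] * 5 + ["sit"] * 2 + ["fall"] + ["crawl"]
    for t in (1, 2, 5):
        print(t, blind_spot_mass(sample, t))
```

With tau = 1 this reduces to the familiar Good-Turing missing-mass estimate N_1 / n, the share of the deployment distribution expected to fall on states never seen in the sample. Raw frequency-of-frequency counts become noisy at larger r, so a practical version would likely smooth N_r first.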
