Predictive Uncertainty Quantification with Missing Covariates
Margaux Zaffran, Julie Josse, Yaniv Romano, Aymeric Dieuleveut

TL;DR
This paper studies how to quantify predictive uncertainty when covariates are missing. It proposes a flexible framework that outputs predictive sets valid conditionally on the missing-values pattern, supported by theoretical and experimental results.
Contribution
The paper introduces CP-MDA-Nested*, a generalized framework for predictive uncertainty quantification with missing covariates whose guarantees hold under certain independence assumptions, and it characterizes the fundamental limits of distribution-free approaches.
Findings
Predictive uncertainty increases with the number of missing values, an idea the paper formalizes rigorously.
CP-MDA-Nested* provides valid predictive sets under independence assumptions.
Experimental results show promising performance beyond the scope of the theoretical guarantees.
Abstract
Predictive uncertainty quantification is crucial in decision-making problems. We investigate how to adequately quantify predictive uncertainty with missing covariates. A bottleneck is that missing values induce heteroskedasticity on the response's predictive distribution given the observed covariates. Thus, we focus on building predictive sets for the response that are valid conditionally on the missing-values pattern. We show that this goal is impossible to achieve informatively in a distribution-free fashion, and we propose useful restrictions on the distribution class. Motivated by these hardness results, we characterize how missing values and predictive uncertainty intertwine. In particular, we rigorously formalize the idea that the more missing values, the higher the predictive uncertainty. Then, we introduce a generalized framework, coined CP-MDA-Nested*, outputting predictive sets…
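The heteroskedasticity claim in the abstract can be illustrated with a toy simulation. The sketch below is not CP-MDA-Nested*, whose details are not given on this page. It assumes a hypothetical Gaussian model Y = X1 + X2 + noise with independent standard normal terms, X2 missing completely at random with probability 0.5, and an oracle impute-then-predict rule (missing X2 replaced by its mean 0). It compares a single split-conformal threshold calibrated on all patterns with the simplest mask-conditional baseline, a separate threshold per missing-values pattern. All function names are illustrative.

```python
import numpy as np


def simulate(n, rng):
    """Toy data (assumed model): Y = X1 + X2 + eps, all terms N(0, 1)."""
    x = rng.normal(size=(n, 2))
    y = x.sum(axis=1) + rng.normal(size=n)
    missing = rng.random(n) < 0.5  # True where X2 is missing (MCAR)
    return x, y, missing


def predict(x, missing):
    """Impute a missing X2 by its mean (0), then apply the oracle regression."""
    x2 = np.where(missing, 0.0, x[:, 1])
    return x[:, 0] + x2


def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points for a finite threshold
    return np.sort(scores)[k - 1]


rng = np.random.default_rng(0)
alpha = 0.1
x_cal, y_cal, m_cal = simulate(20000, rng)
x_test, y_test, m_test = simulate(20000, rng)

# Absolute residuals serve as conformity scores.
s_cal = np.abs(y_cal - predict(x_cal, m_cal))
resid_test = y_test - predict(x_test, m_test)
s_test = np.abs(resid_test)

# One threshold calibrated on all patterns pooled together.
q_marginal = conformal_quantile(s_cal, alpha)

results = {}
for name, flag in [("X2 observed", False), ("X2 missing", True)]:
    cal = m_cal == flag
    test = m_test == flag
    # Threshold calibrated only on points sharing this missing-values pattern.
    q_mask = conformal_quantile(s_cal[cal], alpha)
    results[name] = {
        "residual_var": float(np.var(resid_test[test])),
        "cov_marginal": float(np.mean(s_test[test] <= q_marginal)),
        "cov_per_mask": float(np.mean(s_test[test] <= q_mask)),
    }
    print(name, results[name])
```

In this toy setting the residual variance is larger on the pattern where X2 is missing, so the pooled threshold over-covers the complete pattern and under-covers the incomplete one, while per-pattern calibration brings both close to the 1 - alpha target. Per-pattern calibration needs enough calibration points for every pattern, and the number of patterns grows exponentially with the number of covariates, so it does not scale in general.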
