Predictive Uncertainty Quantification with Missing Covariates

Margaux Zaffran; Julie Josse; Yaniv Romano; Aymeric Dieuleveut

arXiv:2405.15641·stat.ME·May 27, 2024


TL;DR

This paper studies how to quantify predictive uncertainty when covariates are missing. It proposes a flexible framework whose predictive sets are valid conditionally on the pattern of missing values, supported by theoretical and experimental results.

Contribution

The paper introduces CP-MDA-Nested*, a framework for predictive uncertainty quantification with missing covariates whose predictive sets are valid under certain independence assumptions, and it explores the fundamental limits of distribution-free approaches.
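The validity target can be written down compactly. The display below is a sketch in assumed notation, since this summary does not fix any symbols: X denotes the covariates, M the missingness pattern (mask) over d covariates, Y the response, \widehat{C}_\alpha the predictive set, and \alpha the target miscoverage level.

```latex
% Validity conditional on the missing-values pattern (notation assumed):
% coverage must hold for each pattern m, not only on average over patterns.
\[
  \mathbb{P}\bigl( Y \in \widehat{C}_\alpha(X, M) \,\big|\, M = m \bigr) \;\ge\; 1 - \alpha
  \qquad \text{for every pattern } m \in \{0,1\}^d .
\]
% Marginal validity is the weaker requirement obtained by dropping the conditioning:
\[
  \mathbb{P}\bigl( Y \in \widehat{C}_\alpha(X, M) \bigr) \;\ge\; 1 - \alpha .
\]
```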

Findings

01. Predictive uncertainty increases with more missing data.

02. CP-MDA-Nested* provides valid predictive sets under independence assumptions.

03. Experimental results show promising performance beyond theoretical scope.

Abstract

Predictive uncertainty quantification is crucial in decision-making problems. We investigate how to adequately quantify predictive uncertainty with missing covariates. A bottleneck is that missing values induce heteroskedasticity on the response's predictive distribution given the observed covariates. Thus, we focus on building predictive sets for the response that are valid conditionally to the missing values pattern. We show that this goal is impossible to achieve informatively in a distribution-free fashion, and we propose useful restrictions on the distribution class. Motivated by these hardness results, we characterize how missing values and predictive uncertainty intertwine. Particularly, we rigorously formalize the idea that the more missing values, the higher the predictive uncertainty. Then, we introduce a generalized framework, coined CP-MDA-Nested*, outputting predictive sets…
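The link between missing values and heteroskedasticity can be illustrated with a small simulation. The sketch below is not CP-MDA-Nested*; it is a naive split-conformal calibration done separately for each missingness pattern, under assumptions chosen purely for illustration: a toy model Y = X1 + X2 + noise with independent standard normal covariates, only X2 possibly missing, missingness independent of the data, and an oracle predictor. All function names are hypothetical.

```python
import numpy as np


def simulate(n, rng, p_missing=0.5, noise_sd=0.5):
    """Toy data (assumed model): Y = X1 + X2 + noise, X2 missing completely at random."""
    x = rng.standard_normal((n, 2))
    y = x.sum(axis=1) + noise_sd * rng.standard_normal(n)
    mask = np.zeros((n, 2), dtype=bool)  # True marks a missing entry
    mask[:, 1] = rng.random(n) < p_missing
    return x, y, mask


def oracle_predict(x, mask):
    """Conditional mean of Y given the observed covariates.

    A missing standard normal covariate contributes its mean, 0.
    """
    return np.where(mask, 0.0, x).sum(axis=1)


def conformal_quantile(scores, alpha):
    """Split-conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points for this level
    return np.sort(scores)[k - 1]


def per_pattern_halfwidths(x, y, mask, alpha):
    """Calibrate the interval half-width separately for each missingness pattern."""
    scores = np.abs(y - oracle_predict(x, mask))
    x2_missing = mask[:, 1]
    return {
        "observed": conformal_quantile(scores[~x2_missing], alpha),
        "x2_missing": conformal_quantile(scores[x2_missing], alpha),
    }


def coverage_by_pattern(x, y, mask, halfwidths):
    """Empirical coverage of mean +/- half-width within each pattern."""
    errors = np.abs(y - oracle_predict(x, mask))
    x2_missing = mask[:, 1]
    return {
        "observed": float(np.mean(errors[~x2_missing] <= halfwidths["observed"])),
        "x2_missing": float(np.mean(errors[x2_missing] <= halfwidths["x2_missing"])),
    }


rng = np.random.default_rng(0)
alpha = 0.1
x_cal, y_cal, m_cal = simulate(20_000, rng)
x_test, y_test, m_test = simulate(20_000, rng)

halfwidths = per_pattern_halfwidths(x_cal, y_cal, m_cal, alpha)
coverage = coverage_by_pattern(x_test, y_test, m_test, halfwidths)
print(halfwidths)
print(coverage)
```

In this toy model the residual standard deviation is 0.5 when X2 is observed and sqrt(1.25) when it is missing, so the calibrated half-width for the pattern with a missing value should be noticeably larger. A single half-width pooled over both patterns would tend to over-cover one pattern and under-cover the other, which is why validity conditional on the pattern is the harder target.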
