Conformal Prediction with Corrupted Labels: Uncertain Imputation and Robust Re-weighting

Shai Feldman; Stephen Bates; Yaniv Romano

arXiv:2505.04733·cs.LG·February 27, 2026

TL;DR

This paper develops robust conformal prediction methods for datasets with corrupted labels, introducing uncertain imputation and re-weighting techniques that maintain valid uncertainty quantification despite data corruptions.

Contribution

It proposes new conformal prediction methods that handle corrupted labels using uncertain imputation and robust re-weighting, with theoretical guarantees and empirical validation.

Findings

01

Robust conformal prediction remains valid with poorly estimated weights.

02

Uncertain imputation effectively preserves label uncertainty.

03

The triply robust framework guarantees valid predictions under multiple conditions.

Abstract

We introduce a framework for robust uncertainty quantification in situations where labeled training data are corrupted, through noisy or missing labels. We build on conformal prediction, a statistical tool for generating prediction sets that cover the test label with a pre-specified probability. The validity of conformal prediction, however, holds under the i.i.d. assumption, which does not hold in our setting due to the corruptions in the data. To account for this distribution shift, the privileged conformal prediction (PCP) method proposed leveraging privileged information (PI) -- additional features available only during training -- to re-weight the data distribution, yielding valid prediction sets under the assumption that the weights are accurate. In this work, we analyze the robustness of PCP to inaccuracies in the weights. Our analysis indicates that PCP can still yield valid…
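To make the re-weighting idea in the abstract concrete, here is a minimal sketch of generic weighted split conformal calibration. It is not the paper's PCP or UI procedure. The function name `weighted_conformal_quantile` and its arguments are hypothetical, and the weights are assumed to be given (for example, estimated likelihood ratios between the test and calibration distributions).

```python
import numpy as np

def weighted_conformal_quantile(scores, weights, test_weight, alpha):
    """Weighted (1 - alpha) quantile of calibration scores.

    Hypothetical helper, for illustration only. The test point contributes
    a point mass at +infinity with (unnormalized) weight `test_weight`.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Sort scores and carry the weights along.
    order = np.argsort(scores)
    sorted_scores = scores[order]

    # Normalize by the total mass, including the test point's weight.
    total = weights.sum() + test_weight
    cum_mass = np.cumsum(weights[order]) / total

    # Smallest score whose cumulative mass reaches 1 - alpha.
    idx = np.searchsorted(cum_mass, 1.0 - alpha, side="left")
    if idx >= len(sorted_scores):
        # Not enough calibration mass: the prediction set is unbounded.
        return np.inf
    return sorted_scores[idx]

# Example: absolute residuals as scores, uniform weights.
calibration_scores = np.arange(1.0, 10.0)
q = weighted_conformal_quantile(calibration_scores, np.ones(9), 1.0, alpha=0.25)
# For a point prediction y_hat, the interval would be [y_hat - q, y_hat + q].
```

With uniform weights this reduces to ordinary split conformal calibration. When the weights are inaccurate, the threshold `q` moves, which is the sensitivity that the paper's robustness analysis is concerned with.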

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 3

Strengths

First, the authors investigate the theoretical properties of the PCP method when the distribution shift $w(z)$ deviates from the true value, thereby enriching the existing theoretical results. Second, the authors propose a novel UI method. By incorporating the idea of imputation, this method constructs reliable and effective prediction sets for test points without requiring the shift $w(z)$. The paper is clearly written and provides a new solution for constructing prediction sets using datasets wit

Weaknesses

The paper could be strengthened by a more thorough discussion of the proposed UI method. For instance, Theorem 4 assumes that "the residual errors are independent of the predictions of $g^{*}$ and of $C^{UI}$ given the PI $Z$." It would be important to clarify the practical scenarios in which this condition can be reasonably expected to hold. Furthermore, there is no clear evidence that the UI method consistently outperforms the PCP method. When facing a practical problem, what characteristics sho

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1- The paper presents a solid and technically sound theoretical analysis of conformal prediction methods under label corruption, with clear insights into how PCP and WCP behave when weight estimates are inaccurate. 2- The manuscript is clearly written, logically structured, and easy to follow despite the technical content.

Weaknesses

Please check questions!

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The problem is applicable in many cases where there is any type of label noise. 2. The authors approached the problem in an organized way. They clearly break down the cases where the labels are faulty, and address each case separately. 3. The theoretical contribution of the paper is considerable. Overall, while the problem is not clearly defined and solved, I think the theoretical contribution is above the standard for acceptance.

Weaknesses

1. In general, I could not connect the theorems in Section 3.1 to derive a robustness guarantee or a clear understanding of what the procedure is and what the guarantee would be. This is a shortcoming for applications, as it is not clear which assumption holds. 2. Minor writing point: on line 176 the term "indicator" is used twice; maybe you can drop the second one. 3. Why do the authors even discuss the setup with constant noise on the weights? Isn't it too unrealistic? 4. The definition

Code & Models

Repositories

Shai128/ui
PyTorch · Official

Taxonomy

Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis