Split Conformal Prediction under Data Contamination
Jase Clarkson, Wenkai Xu, Mihai Cucuringu, Yvik Swan, Gesine Reinert

TL;DR
This paper investigates the robustness of split conformal prediction under data contamination, quantifies the impact of contamination on coverage and efficiency, and proposes an adjustment for the classification setting that makes the method robust to contamination.
Contribution
It introduces a novel analysis of split conformal prediction's robustness to data contamination and proposes a new contamination-robust conformal prediction method.
Findings
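Contaminated calibration scores change both the coverage and the efficiency of the prediction sets when these are evaluated on clean test points.
The proposed adjustment, designed for the classification setting, improves robustness when the calibration data is contaminated.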
Numerical experiments validate the effectiveness of the new method.
Abstract
Conformal prediction is a non-parametric technique for constructing prediction intervals or sets from arbitrary predictive models under the assumption that the data is exchangeable. It is popular because it comes with theoretical guarantees on the marginal coverage of the prediction sets, and the split conformal prediction variant has a very low computational cost compared to model training. We study the robustness of split conformal prediction in a data contamination setting, where we assume a small fraction of the calibration scores are drawn from a different distribution than the bulk. We quantify the impact of the corrupted data on the coverage and efficiency of the constructed sets when evaluated on "clean" test points, and verify our results with numerical experiments. Moreover, we propose an adjustment in the classification setting which we call Contamination Robust Conformal…
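The contamination setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' code: it computes the standard split conformal threshold from calibration scores, then replaces a fraction of those scores with draws from a different distribution and measures coverage on clean test scores. The score distributions (absolute values of Gaussians), the `shift` parameter, and the function names `conformal_quantile` and `simulate_coverage` are all assumptions chosen for illustration. The sketch does not implement the paper's proposed adjustment.

```python
import numpy as np


def conformal_quantile(cal_scores, alpha):
    """Split conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: the set must be trivial.
        return np.inf
    return np.sort(cal_scores)[k - 1]


def simulate_coverage(n_cal=1000, n_test=5000, alpha=0.1, eps=0.1,
                      shift=2.0, trials=200, seed=0):
    """Average coverage on clean test scores when a fraction eps of the
    calibration scores comes from a shifted (hypothetical) distribution."""
    rng = np.random.default_rng(seed)
    coverages = []
    for _ in range(trials):
        # Assumed clean score distribution: |N(0, 1)|.
        clean = np.abs(rng.normal(size=n_cal))
        # Assumed contaminating distribution: |N(shift, 1)|.
        corrupted = np.abs(rng.normal(loc=shift, size=n_cal))
        # Each calibration score is contaminated independently with prob. eps.
        is_corrupted = rng.random(n_cal) < eps
        cal_scores = np.where(is_corrupted, corrupted, clean)

        threshold = conformal_quantile(cal_scores, alpha)

        # Test scores are drawn from the clean distribution only.
        test_scores = np.abs(rng.normal(size=n_test))
        coverages.append(np.mean(test_scores <= threshold))
    return float(np.mean(coverages))


if __name__ == "__main__":
    for eps in (0.0, 0.05, 0.1, 0.2):
        print(f"eps={eps:.2f}  clean-test coverage={simulate_coverage(eps=eps):.3f}")
```

In this particular setup the contaminating scores tend to be larger than the clean ones, so the threshold is inflated and coverage on clean test points rises above the nominal 1 - alpha, at the cost of larger sets. A contaminating distribution with smaller scores would push coverage the other way; the direction depends on the assumed distributions, not on the method itself.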
Taxonomy
Topics: Image Processing and 3D Reconstruction · Neural Networks and Applications
