Are foundation models for computer vision good conformal predictors?

Leo Fillioux; Julio Silva-Rodríguez; Ismail Ben Ayed; Paul-Henry Cournède; Maria Vakalopoulou; Stergios Christodoulidis; Jose Dolz

arXiv:2412.06082 · cs.CV · February 17, 2026

TL;DR

This paper evaluates how well vision and vision-language foundation models quantify uncertainty under Conformal Prediction (CP). It examines their suitability for conformalization, the effect of confidence calibration on prediction sets, and the promise of Adaptive Prediction Sets (APS) in real-world scenarios.

Contribution

The paper provides a comprehensive empirical analysis of vision foundation models under conformal prediction, highlighting the effectiveness of Vision Transformers and of APS for reliable uncertainty estimation.

Findings

01. Foundation models are suitable for conformalization, especially Vision Transformers.

02. Calibrating confidence predictions can degrade conformal set efficiency.

03. APS is promising for maintaining coverage guarantees in vision models.
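
To make the APS finding concrete, here is a minimal sketch of split conformal prediction with the Adaptive Prediction Sets score, written with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names (`aps_scores`, `conformal_quantile`, `aps_prediction_sets`) are hypothetical, the non-randomized variant of APS is used, and `probs` is assumed to hold softmax outputs of any classifier on a held-out calibration split.

```python
import numpy as np


def _sorted_cumulative_mass(probs):
    """Rank classes by descending probability; return the ranking and running mass."""
    order = np.argsort(-probs, axis=1)
    sorted_probs = np.take_along_axis(probs, order, axis=1)
    return order, np.cumsum(sorted_probs, axis=1)


def aps_scores(probs, labels):
    """APS score: probability mass of every class ranked at or above the true class."""
    order, cum_mass = _sorted_cumulative_mass(probs)
    # Position of the true label in each row's ranking.
    rank = np.argmax(order == labels[:, None], axis=1)
    return cum_mass[np.arange(len(labels)), rank]


def conformal_quantile(scores, alpha):
    """k-th smallest calibration score with k = ceil((n + 1) * (1 - alpha))."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: fall back to full sets.
        return np.inf
    return np.sort(scores)[k - 1]


def aps_prediction_sets(probs, qhat):
    """Boolean mask (samples x classes) of the classes kept in each prediction set."""
    order, cum_mass = _sorted_cumulative_mass(probs)
    keep_sorted = cum_mass <= qhat
    # Assumption of this sketch: always keep the top class so no set is empty.
    # Adding classes can only increase coverage.
    keep_sorted[:, 0] = True
    mask = np.zeros(probs.shape, dtype=bool)
    np.put_along_axis(mask, order, keep_sorted, axis=1)
    return mask
```

In use, one would compute `qhat = conformal_quantile(aps_scores(cal_probs, cal_labels), alpha)` on the calibration split and then call `aps_prediction_sets(test_probs, qhat)`. The average number of `True` entries per row is the set size, the usual measure of the efficiency that the second finding refers to.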

Abstract

Recent advances in self-supervision and contrastive learning have brought the performance of foundation models to unprecedented levels in a variety of tasks. Fueled by this progress, these models are becoming the prevailing approach for a wide array of real-world vision problems, including risk-sensitive and high-stakes applications. However, ensuring safe deployment in these scenarios requires a more comprehensive understanding of their uncertainty modeling capabilities, which has received little attention. In this work, we delve into the behaviour of vision and vision-language foundation models under Conformal Prediction (CP), a statistical framework that provides theoretical guarantees of marginal coverage of the true class. Across extensive experiments including popular vision classification benchmarks, well-known foundation vision models, and three CP methods, our findings reveal…
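
For reference, the marginal coverage guarantee mentioned in the abstract is commonly stated as below. The notation is introduced here for illustration and is not taken from the paper: $\alpha$ is the user-chosen error rate, $\mathcal{C}$ is the prediction set built from $n$ calibration samples, and $(X_{n+1}, Y_{n+1})$ is a new test pair assumed to be exchangeable with the calibration data.

```latex
\mathbb{P}\big( Y_{n+1} \in \mathcal{C}(X_{n+1}) \big) \;\ge\; 1 - \alpha
```

The probability is taken over both the calibration set and the test point, so the guarantee holds on average across inputs rather than for each individual image.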

Taxonomy

Topics: Industrial Vision Systems and Defect Detection

Methods: Sparse Evolutionary Training