Direct Prediction Set Minimization via Bilevel Conformal Classifier Training
Yuanjie Shi, Hooman Shahrokhi, Xuesong Jia, Xiongzhi Chen, Janardhan Rao Doppa, Yan Yan

TL;DR
This paper introduces DPSM, a bilevel optimization approach that directly minimizes prediction set sizes in conformal prediction, leading to more concise and practical uncertainty quantification in deep classifiers.
Contribution
The paper formulates conformal training as a bilevel optimization problem and proposes DPSM, a novel algorithm that improves prediction set size minimization with theoretical guarantees.
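One way to write such a bilevel problem is sketched below. The source states only that the upper level minimizes a measure of prediction set size and the lower level learns the quantile of conformity scores. Everything else here is an assumption made for illustration: the symbols (model parameters \theta, score S_\theta, miscoverage level \alpha, learned quantile q^\ast), the convention that a lower score means better conformity, and the pinball loss \rho_\tau as the lower-level objective, chosen because its minimizer is the \tau-quantile.

```latex
\begin{aligned}
\min_{\theta}\quad
  & \mathbb{E}_{X}\Big[\sum_{y \in \mathcal{Y}}
      \mathbb{1}\big\{ S_\theta(X, y) \le q^\ast(\theta) \big\}\Big]
  && \text{(upper level: expected set size)} \\
\text{s.t.}\quad
  & q^\ast(\theta) \in \arg\min_{q \in \mathbb{R}}\;
      \mathbb{E}_{(X,Y)}\Big[\rho_{1-\alpha}\big(S_\theta(X, Y) - q\big)\Big]
  && \text{(lower level: quantile of scores)} \\
  & \rho_{\tau}(u) = \max\{\tau u,\ (\tau - 1)\,u\}.
\end{aligned}
```

In practice the indicator in the upper level would need a smooth surrogate to allow gradient-based training; which surrogate DPSM uses is not stated in this summary.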
Findings
DPSM reduces prediction set size by over 20% compared to baselines.
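DPSM has a learning bound of O(1/√n), where n is the number of training samples, which the paper reports as tighter than the bounds of prior conformal training methods.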
Experiments validate DPSM's effectiveness across datasets and models.
Abstract
Conformal prediction (CP) is a promising uncertainty quantification framework that works as a wrapper around a black-box classifier to construct prediction sets (i.e., subsets of candidate classes) with provable guarantees. However, standard calibration methods for CP tend to produce large prediction sets, which makes them less useful in practice. This paper considers the problem of integrating conformal principles into the training process of deep classifiers to directly minimize the size of prediction sets. We formulate conformal training as a bilevel optimization problem and propose the Direct Prediction Set Minimization (DPSM) algorithm to solve it. The key insight behind DPSM is to minimize a measure of the prediction set size (upper level) that is conditioned on the learned quantile of conformity scores (lower level). We analyze that DPSM has a learning bound of…
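For readers unfamiliar with the "wrapper" view of CP mentioned in the abstract, the following is a minimal sketch of standard split conformal calibration for a classifier. It is not DPSM and not the authors' code. The function names, the nonconformity score 1 - p(class), and the synthetic softmax outputs are all assumptions chosen for illustration; only the Python standard library is used.

```python
import math
import random


def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Conformal quantile of the scores 1 - p(true class) on calibration data."""
    n = len(cal_labels)
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    # Finite-sample corrected rank (1-indexed): ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: every class is included.
        return float("inf")
    return scores[rank - 1]


def prediction_set(probs, threshold):
    """All classes whose score 1 - p(class) does not exceed the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]


def toy_probs(label, num_classes, rng, sharpness=3.0):
    """Hypothetical stand-in for a classifier: noisy softmax favoring `label`."""
    logits = [rng.gauss(0.0, 1.0) for _ in range(num_classes)]
    logits[label] += sharpness
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_demo(n_cal=500, n_test=500, num_classes=10, alpha=0.1, seed=0):
    """Calibrate on synthetic data, then report coverage and mean set size."""
    rng = random.Random(seed)
    cal_labels = [rng.randrange(num_classes) for _ in range(n_cal)]
    cal_probs = [toy_probs(y, num_classes, rng) for y in cal_labels]
    tau = calibrate_threshold(cal_probs, cal_labels, alpha)

    test_labels = [rng.randrange(num_classes) for _ in range(n_test)]
    test_probs = [toy_probs(y, num_classes, rng) for y in test_labels]
    sets = [prediction_set(p, tau) for p in test_probs]

    coverage = sum(y in s for y, s in zip(test_labels, sets)) / n_test
    avg_size = sum(len(s) for s in sets) / n_test
    return tau, coverage, avg_size


if __name__ == "__main__":
    tau, coverage, avg_size = run_demo()
    print(f"threshold={tau:.3f} coverage={coverage:.3f} avg_set_size={avg_size:.2f}")
```

Here the classifier is fixed and only the threshold is calibrated after training, so the average set size is whatever the trained model happens to yield. Conformal training methods such as the one described in the abstract instead fold the set size into the training objective.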
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis
