Optimized Conformal Selection: Powerful Selective Inference After Conformity Score Optimization
Tian Bai, Ying Jin

TL;DR
This paper introduces OptCS, a flexible framework for valid conformal inference after data-driven model optimization, addressing challenges of model selection and power loss in conformal prediction.
Contribution
It proposes a general method for constructing valid conformal p-values after data-driven model optimization, supporting multiple model selection strategies with finite-sample FDR control.
Findings
OptCS effectively controls FDR in various model selection scenarios.
The methods improve power without sacrificing validity in simulations.
Applications demonstrate practical benefits in drug discovery and radiology report generation.
Abstract
Model selection/optimization in conformal inference is challenging, since it may break the exchangeability between labeled and unlabeled data. We study this problem in the context of conformal selection, which uses conformal p-values to select "interesting" instances with large unobserved labels from a pool of unlabeled data, while controlling the false discovery rate (FDR) in finite samples. For validity, existing solutions require the model choice to be independent of the data used to construct the p-values and calibrate the selection set. However, when presented with many model choices and limited labeled data, it is desirable to (i) select the best model in a data-driven manner, and (ii) mitigate power loss due to sample splitting. This paper presents OptCS, a general framework that allows valid statistical testing (selection) after flexible data-driven model optimization. We introduce general…
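To make the setting concrete, the following is a minimal sketch of baseline conformal selection with a single, pre-specified model, i.e. the situation in which the model choice is independent of the calibration data. It is not the OptCS procedure itself. The score V(x, y) = y - mu(x), the threshold `c`, and all function names are illustrative assumptions rather than details taken from the abstract. Each test point gets a conformal p-value for the null hypothesis that its label is at most `c`, and the Benjamini-Hochberg procedure is then applied at level `q`.

```python
import numpy as np

def conformal_pvalues(calib_scores, test_scores):
    """Deterministic conformal p-values: (1 + #{calib score < test score}) / (n + 1)."""
    calib_sorted = np.sort(np.asarray(calib_scores, dtype=float))
    test_scores = np.asarray(test_scores, dtype=float)
    n = calib_sorted.size
    # side="left" counts calibration scores strictly below each test score
    below = np.searchsorted(calib_sorted, test_scores, side="left")
    return (1.0 + below) / (n + 1.0)

def benjamini_hochberg(pvals, q):
    """Return sorted indices rejected by the Benjamini-Hochberg procedure at level q."""
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)
    thresholds = q * np.arange(1, m + 1) / m
    passed = np.nonzero(pvals[order] <= thresholds)[0]
    if passed.size == 0:
        return np.array([], dtype=int)
    k = passed[-1] + 1  # largest k with p_(k) <= q * k / m
    return np.sort(order[:k])

def conformal_select(mu_calib, y_calib, mu_test, c, q):
    """Select test points whose unobserved label is likely to exceed c.

    mu_calib, mu_test: predictions of a model fitted on data that is
    independent of the calibration set (assumption of this sketch).
    """
    # Assumed monotone score V(x, y) = y - mu(x); test scores plug in y = c.
    calib_scores = np.asarray(y_calib, dtype=float) - np.asarray(mu_calib, dtype=float)
    test_scores = c - np.asarray(mu_test, dtype=float)
    pvals = conformal_pvalues(calib_scores, test_scores)
    return benjamini_hochberg(pvals, q), pvals
```

A call such as `conformal_select(mu_calib, y_calib, mu_test, c=10.0, q=0.2)` returns the indices of the selected test points together with their p-values. Choosing `mu` by comparing many candidate models on the same calibration data would break the exchangeability this sketch relies on, which is the gap the abstract says OptCS is designed to address.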
Taxonomy
Topics: Machine Learning and Data Classification
