Conformal Sets in Multiple-Choice Question Answering under Black-Box Settings with Provable Coverage Guarantees
Guang Yang, Xinyang Liu

TL;DR
This paper introduces a black-box conformal prediction method for multiple-choice question answering with provable coverage guarantees, improving uncertainty estimation and reliability of large language models in high-stakes domains.
Contribution
It proposes a frequency-based uncertainty quantification approach using conformal prediction that works without access to model internals, providing distribution-free coverage guarantees.
Findings
Frequency-based predictive entropy (PE) outperforms logit-based PE at distinguishing correct from incorrect predictions, as measured by AUROC.
The method effectively controls empirical miscoverage rates.
Sampling frequency serves as a viable substitute for logit probabilities in black-box settings.
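As a rough illustration of how sampling frequency can stand in for logit probabilities, the sketch below turns repeated samples for one question into an empirical distribution over the answer options and computes its Shannon entropy. The function names, the example samples, and the use of plain Shannon entropy are assumptions made for illustration; this summary does not give the paper's exact PE formula.

```python
import math
from collections import Counter

def frequency_distribution(samples, options):
    """Empirical frequency of each option among the sampled answers."""
    counts = Counter(samples)
    n = len(samples)
    return {opt: counts.get(opt, 0) / n for opt in options}

def predictive_entropy(freqs):
    """Shannon entropy (in nats) of the empirical distribution.

    Assumption: PE is taken as plain entropy over sampling frequencies.
    """
    return -sum(p * math.log(p) for p in freqs.values() if p > 0)

# Hypothetical example: 10 sampled answers to a 4-option question.
options = ["A", "B", "C", "D"]
samples = ["B", "B", "A", "B", "C", "B", "B", "A", "B", "B"]

freqs = frequency_distribution(samples, options)
reference = max(freqs, key=freqs.get)  # most frequent sample
pe = predictive_entropy(freqs)
```

A low entropy means the samples agree with each other; a high entropy signals that the model's answers are spread across options, which is the kind of signal AUROC is then used to evaluate.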
Abstract
Large Language Models (LLMs) have shown remarkable progress in multiple-choice question answering (MCQA), but their inherent unreliability, such as hallucination and overconfidence, limits their application in high-risk domains. To address this, we propose a frequency-based uncertainty quantification method for black-box settings that leverages conformal prediction (CP) to ensure provable coverage guarantees. Our approach draws multiple independent samples from the model's output distribution for each input and uses the most frequent sample as a reference for calculating predictive entropy (PE). Experimental evaluations across six LLMs and four datasets (MedMCQA, MedQA, MMLU, MMLU-Pro) demonstrate that frequency-based PE outperforms logit-based PE in distinguishing between correct and incorrect predictions, as measured by AUROC. Furthermore, the method effectively controls the…
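The conformal step described in the abstract can be sketched with standard split conformal prediction. The sketch assumes a nonconformity score of 1 minus the sampling frequency of an option; that score, the function names, and the calibration numbers are illustrative assumptions, not details confirmed by this summary. The threshold is the ceil((n + 1)(1 - alpha))-th smallest calibration score, which is the usual finite-sample quantile in split conformal prediction.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Finite-sample conformal quantile of the calibration scores."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: include every option.
        return float("inf")
    return sorted(cal_scores)[k - 1]

def prediction_set(freqs, threshold):
    """Keep every option whose score (1 - frequency) is within the threshold."""
    return [opt for opt, p in freqs.items() if 1.0 - p <= threshold]

# Hypothetical calibration data: sampling frequency of the TRUE option
# for each of 9 calibration questions.
true_option_freqs = [0.9, 0.8, 0.7, 1.0, 0.6, 0.4, 0.9, 0.8, 0.5]
cal_scores = [1.0 - f for f in true_option_freqs]

alpha = 0.25  # target miscoverage rate
threshold = conformal_threshold(cal_scores, alpha)

# Hypothetical test questions with their sampled option frequencies.
confident = prediction_set({"A": 0.2, "B": 0.7, "C": 0.1, "D": 0.0}, threshold)
uncertain = prediction_set({"A": 0.5, "B": 0.5, "C": 0.0, "D": 0.0}, threshold)
```

Under exchangeability of calibration and test questions, sets built this way contain the true option with probability at least 1 - alpha. The set grows when the samples disagree, so set size itself acts as an uncertainty signal.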
