Geometry-Calibrated Conformal Abstention for Language Models
Rui Xu, Yi Chen, Sihong Xie, Hui Xiong

TL;DR
The paper introduces a post hoc conformal abstention framework for language models that guarantees selective answering quality by calibrating prediction confidence through representation geometry.
Contribution
It adapts conformal prediction to language models for abstention, using geometry-based calibration to improve selective answering with theoretical guarantees.
Findings
Achieves 75% conditional correctness in abstention.
Improves selective answering over baseline methods.
Provides finite-sample guarantees on abstention and correctness.
Abstract
When language models lack relevant knowledge for a given query, they frequently generate plausible responses that can be hallucinations, rather than admitting being agnostic about the answer. Retraining models to reward admitting ignorance can lead to overly conservative behaviors and poor generalization due to scarce evaluation benchmarks. We propose a post hoc framework, Conformal Abstention (CA), adapted from conformal prediction (CP) to determine whether to abstain from answering a query. CA provides finite-sample guarantees on both the probability of participation (i.e., not abstaining) and the probability that the generated response is correct. Importantly, the abstention decision relies on prediction confidence rather than the non-conformity scores used in CP, which are intractable for open-ended generation. To better align prediction confidence with the model's ignorance, we…
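To make the idea of a calibrated abstention threshold concrete, here is a minimal sketch in Python. It is not the paper's CA procedure: it uses a generic split-conformal quantile argument on a held-out set of confidence scores, and it covers only the participation side (the probability of not abstaining). The names `calibrate_threshold`, `should_answer`, and `selective_accuracy` are hypothetical, and the confidence scores are assumed to be exchangeable with those seen at test time.

```python
import math


def calibrate_threshold(cal_confidences, alpha):
    """Pick a threshold tau from held-out confidence scores.

    Assumption (not from the paper): calibration and test scores are
    exchangeable. Then a new score satisfies score >= tau with
    probability at least 1 - alpha, i.e. the model participates
    (does not abstain) at least that often.
    """
    n = len(cal_confidences)
    # Rank of the calibration score to use as threshold.
    k = math.floor(alpha * (n + 1))
    if k == 0:
        # Too few calibration points to abstain at this alpha:
        # fall back to always answering.
        return float("-inf")
    return sorted(cal_confidences)[k - 1]


def should_answer(confidence, tau):
    """Answer when confidence reaches the threshold, abstain otherwise."""
    return confidence >= tau


def selective_accuracy(confidences, correct, tau):
    """Empirical accuracy on the queries that would be answered.

    Returns None when every query is abstained on.
    """
    answered = [c for s, c in zip(confidences, correct) if s >= tau]
    if not answered:
        return None
    return sum(answered) / len(answered)
```

In this sketch a smaller `alpha` lowers the threshold and lets the model answer more often. `selective_accuracy` is only an empirical check on held-out data, not a guarantee; the correctness guarantee described in the abstract needs the paper's own construction.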
