Entropy Alone is Insufficient for Safe Selective Prediction in LLMs
Edward Phillips, Fredrik K. Gustafsson, Sean Wu, Anshul Thakur, David A. Clifton

TL;DR
This paper demonstrates that relying solely on entropy for uncertainty estimation in large language models is inadequate for safe selective prediction, and proposes a combined scoring method to improve abstention reliability across multiple benchmarks.
Contribution
The paper identifies limitations of entropy-based uncertainty methods and introduces a combined entropy and correctness probe approach to enhance selective prediction in LLMs.
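As a rough illustration of what combining an entropy score with a correctness probe signal could look like, the sketch below mixes a normalised entropy with the probe's estimated probability of error. The combination rule, the `weight` parameter, and the function names are assumptions made for illustration; this summary does not state how the paper combines the two signals.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def combined_uncertainty(entropy, max_entropy, p_correct, weight=0.5):
    """Hypothetical combined score in [0, 1]; higher means more uncertain.

    ASSUMPTION: a convex mix of normalised entropy and the probe's estimated
    probability of error (1 - p_correct). The paper may use a different rule.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie in [0, 1]")
    # Clamp the normalised entropy so the result stays in [0, 1].
    norm_entropy = 0.0
    if max_entropy > 0:
        norm_entropy = min(max(entropy / max_entropy, 0.0), 1.0)
    return weight * norm_entropy + (1.0 - weight) * (1.0 - p_correct)


# An answer with zero entropy still receives a non-zero uncertainty score
# when the (hypothetical) probe assigns it a low probability of being correct.
score = combined_uncertainty(entropy=0.0, max_entropy=1.0, p_correct=0.2)
```

In this sketch the probe can raise the uncertainty of an answer that entropy alone would rank as confident, which is the kind of complementary signal the contribution describes.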
Findings
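Combining entropy with a correctness probe signal generally improves the risk-coverage trade-off relative to entropy-only baselines.
The combined score also generally improves calibration performance over entropy-only baselines.
These results are reported across three QA benchmarks (TriviaQA, BioASQ, MedicalQA) and four model families.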
Abstract
Selective prediction systems can mitigate harms resulting from language model hallucinations by abstaining from answering in high-risk cases. Uncertainty quantification techniques are often employed to identify such cases, but are rarely evaluated in the context of the wider selective prediction policy and its ability to operate at low target error rates. We identify a model-dependent failure mode of entropy-based uncertainty methods that leads to unreliable abstention behaviour, and address it by combining entropy scores with a correctness probe signal. We find that across three QA benchmarks (TriviaQA, BioASQ, MedicalQA) and four model families, the combined score generally improves both the risk-coverage trade-off and calibration performance relative to entropy-only baselines. Our results highlight the importance of deployment-facing evaluation of uncertainty methods, using metrics…
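The risk-coverage trade-off mentioned in the abstract can be made concrete with a small sketch. The code below uses the standard definition of a risk-coverage curve (answer the most confident questions first and track the error rate among those answered) and then reads off the largest coverage that meets a target error rate. The function names and the toy data are illustrative assumptions, not the paper's evaluation code.

```python
def risk_coverage_curve(uncertainties, correct):
    """Return (coverage, risk) points, answering the least uncertain items first.

    coverage = fraction of questions answered;
    risk     = error rate among the answered questions.
    """
    order = sorted(range(len(uncertainties)), key=lambda i: uncertainties[i])
    points = []
    errors = 0
    for answered, idx in enumerate(order, start=1):
        if not correct[idx]:
            errors += 1
        points.append((answered / len(order), errors / answered))
    return points


def max_coverage_at_risk(points, target_risk):
    """Largest coverage whose selective risk is at most target_risk (0.0 if none)."""
    feasible = [coverage for coverage, risk in points if risk <= target_risk]
    return max(feasible, default=0.0)


# Toy data (made up for illustration): four questions, one answered wrongly.
uncertainties = [0.1, 0.9, 0.3, 0.7]
correct = [True, False, True, True]

points = risk_coverage_curve(uncertainties, correct)
coverage = max_coverage_at_risk(points, target_risk=0.1)  # 0.75 on this toy data
```

Here the only wrong answer also has the highest uncertainty, so abstaining on it alone keeps the error rate at zero while still answering three of the four questions.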
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education
