Justified or Just Convincing? Error Verifiability as a Dimension of LLM Quality
Xiaoyuan Zhu, Kimberly Le Truong, Riccardo Fogliato, Gokul Swamy, Weijian Zhang, Minglai Yang, Longtian Ye, Bangya Liu, Minghao Liu, Andrew Ilyas, Steven Wu

TL;DR
This paper introduces error verifiability as a key dimension of LLM quality, proposing a metric for it and methods that make a response's correctness easier to verify from its justification.
Contribution
It formalizes error verifiability, proposes a balanced metric, and introduces two domain-aware methods that enhance the verifiability of LLM responses.
Findings
Common approaches, such as post-training and model scaling, do not improve verifiability.
Reflect-and-rephrase (for mathematical reasoning) and oracle-rephrase (for factual QA) improve verifiability.
Error verifiability is a distinct response quality dimension.
Abstract
As LLMs are deployed in high-stakes settings, users must judge the correctness of individual responses, often relying on model-generated justifications such as reasoning chains or explanations. Yet no standard measure exists for whether these justifications help users distinguish correct answers from incorrect ones. We formalize this idea as error verifiability and propose a balanced metric that measures whether justifications enable raters to accurately assess answer correctness, validated against human raters who show high agreement. We find that neither common approaches, such as post-training and model scaling, nor more targeted recommended interventions improve verifiability. We introduce two methods that succeed at improving verifiability: reflect-and-rephrase (RR) for mathematical reasoning and oracle-rephrase (OR) for factual QA, both of which improve…
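To make the idea of a balanced metric concrete, here is a minimal sketch of one plausible reading: the balanced accuracy of rater judgments against ground-truth answer correctness. This is an assumption for illustration, not the paper's definition; the function name `balanced_verifiability` and its inputs are hypothetical.

```python
from typing import Sequence


def balanced_verifiability(
    answer_correct: Sequence[bool],
    rater_accepts: Sequence[bool],
) -> float:
    """Balanced accuracy of rater accept/reject judgments (illustrative only).

    answer_correct[i]: whether the model's i-th answer is actually correct.
    rater_accepts[i]:  whether a rater, after reading the justification,
                       judged the i-th answer to be correct.
    Assumption: the metric is the mean of the true-positive and
    true-negative rates; the source does not spell out its formula.
    """
    if len(answer_correct) != len(rater_accepts):
        raise ValueError("inputs must have the same length")

    n_pos = sum(1 for c in answer_correct if c)
    n_neg = len(answer_correct) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one correct and one incorrect answer")

    pairs = list(zip(answer_correct, rater_accepts))
    # Correct answers that the rater accepted.
    true_pos = sum(1 for c, a in pairs if c and a)
    # Incorrect answers that the rater rejected.
    true_neg = sum(1 for c, a in pairs if not c and not a)

    return 0.5 * (true_pos / n_pos + true_neg / n_neg)


# A rater who accepts every answer scores 0.5, even though 3 of 4 answers
# are correct, because no incorrect answer is caught.
always_accept = balanced_verifiability(
    [True, True, True, False],
    [True, True, True, True],
)
```

Balancing the two rates means a rater cannot score well simply by trusting a mostly correct model, which is why a balanced score, rather than raw agreement, suits the question of whether justifications help separate correct answers from incorrect ones.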
