EpiCaR: Knowing What You Don't Know Matters for Better Reasoning in LLMs
Jewon Yeom, Jaewon Sok, Seonghyeon Park, Jeongjae Park, Taesup Kim

TL;DR
EpiCaR introduces a training method for large language models that improves reasoning accuracy and calibration by explicitly teaching models when to trust their reasoning, leading to better performance and efficiency.
Contribution
The paper proposes epistemically-calibrated reasoning (EpiCaR), a novel training objective that jointly optimizes reasoning accuracy and uncertainty calibration in LLMs.
Findings
Achieves Pareto-superior accuracy and calibration on Llama-3 and Qwen-3 models.
Generalizes well to out-of-distribution mathematical reasoning and code generation tasks.
Reduces inference compute by 3X while maintaining high performance.
Abstract
Improving the reasoning abilities of large language models (LLMs) has largely relied on iterative self-training with model-generated data. While effective at boosting accuracy, existing approaches primarily reinforce successful reasoning paths, incurring a substantial calibration cost: models become overconfident and lose the ability to represent uncertainty. This failure has been characterized as a form of model collapse in alignment, where predictive distributions degenerate toward low-variance point estimates. We address this issue by reframing reasoning training as an epistemic learning problem, in which models must learn not only how to reason, but also when their reasoning should be trusted. We propose epistemically-calibrated reasoning (EpiCaR) as a training objective that jointly optimizes reasoning performance and calibration, and instantiate it within an iterative supervised…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Advanced Graph Neural Networks · Natural Language Processing Techniques
