Epistemic Regret Minimization: Label-Free Causal Critique Beyond Outcome Reward
Edward Y. Chang, Longling Geng

TL;DR
This paper introduces Epistemic Regret Minimization (ERM), a label-free framework for diagnosing and repairing causal reasoning errors in large language models by critiquing their reasoning structure rather than just their answers.
Contribution
ERM is a novel, label-free method that critiques the causal structure of model reasoning, improving causal understanding without requiring true causal graphs or correct answers.
Findings
ERM detects and repairs causal reasoning errors within a single episode.
ERM significantly improves causal reasoning accuracy across multiple LLMs.
Outcome-only correction methods underperform ERM on causal tasks.
Abstract
Large language models can answer causal questions correctly for the wrong reasons. Current RL methods reward what a model concludes but ignore why, reinforcing correlational shortcuts, a failure we call Reward Entrenchment. We introduce Epistemic Regret Minimization (ERM), a framework that critiques the causal structure of a model's reasoning trace rather than its answer. Applying established causal principles, ERM flags unexamined confounders, correlation-intervention conflation, and unchecked back-door paths from exposed reasoning traces. The framework admits label-free operation (without the true causal graph or correct answer), and we separately distinguish favorable benchmark-derived critique, error-direction cues, and fully label-free judge-generated critique in the experiments. Within a single episode, ERM detects and repairs…
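To make the idea of an "unchecked back-door path" concrete, here is a minimal Python sketch of the standard back-door check from causal inference, applied to the graph that a reasoning trace itself asserts. It is an illustration under stated assumptions, not the authors' implementation: the listing does not describe how ERM extracts structure from a trace, so the edge list, the adjustment set, and every function name below are hypothetical. The sketch also assumes the asserted graph is a DAG.

```python
# Hypothetical sketch: flag back-door paths a reasoning trace left unblocked.
# Input is the causal graph the trace *claims*, not a ground-truth graph.

def backdoor_paths(edges, x, y):
    """Enumerate simple paths from x to y whose first edge points INTO x."""
    nbrs = {}
    for a, b in edges:
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)
    paths = []

    def walk(path):
        node = path[-1]
        if node == y:
            paths.append(path)
            return
        for nxt in sorted(nbrs.get(node, ())):
            if nxt not in path:
                walk(path + [nxt])

    for parent in sorted(a for a, b in edges if b == x):
        walk([x, parent])
    return paths


def descendants(edges, node):
    """All nodes reachable from `node` along directed edges."""
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def is_blocked(edges, path, adjusted):
    """d-separation along one path: a non-collider in the adjustment set
    blocks it; a collider blocks it unless it (or a descendant) is adjusted."""
    edge_set = set(edges)
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, mid) in edge_set and (nxt, mid) in edge_set
        if collider:
            if mid not in adjusted and not (descendants(edges, mid) & adjusted):
                return True
        elif mid in adjusted:
            return True
    return False


def critique(edges, x, y, adjusted):
    """Return every back-door path from x to y that the trace left open."""
    adjusted = set(adjusted)
    return [p for p in backdoor_paths(edges, x, y)
            if not is_blocked(edges, p, adjusted)]


# Hypothetical edges read off a reasoning trace about smoking and cancer.
trace_edges = [
    ("smoking", "tar"), ("tar", "cancer"),
    ("genotype", "smoking"), ("genotype", "cancer"),
]
print(critique(trace_edges, "smoking", "cancer", adjusted=set()))
print(critique(trace_edges, "smoking", "cancer", adjusted={"genotype"}))
```

The first call reports the open path through `genotype`; the second reports none, because the trace adjusted for it. No correct answer or true graph is consulted, which mirrors the label-free framing only loosely: a check of this kind audits the internal consistency of the reasoning, not its agreement with a reference.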
