TRACE: Trajectory Correction from Cross-layer Evidence for Hallucination Reduction
Tej Sanibh Ranade

TL;DR
TRACE is a training-free, inference-time algorithm that corrects hallucinations in large language models using cross-layer evidence, improving factual accuracy across 15 models and 3 benchmarks.
Contribution
The paper introduces TRACE, a novel inference-time method that dynamically corrects hallucinations using internal model evidence without additional training or external data.
Findings
TRACE improves factual accuracy across 15 models and 3 benchmarks.
It achieves mean gains of +12.26 MC1 and +8.65 MC2 points.
Gains reach up to +47.20 MC1 and +43.38 MC2 points.
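For readers unfamiliar with the metric names, the sketch below shows how MC1 and MC2 are computed for a single question, assuming they follow the TruthfulQA multiple-choice definitions (this page does not name the benchmarks, so that is an assumption). The function names and inputs are illustrative and not taken from the paper.

```python
import math


def mc1(log_probs, true_idx):
    """Return 1.0 if the highest-scored choice is the single correct one, else 0.0.

    Assumes the TruthfulQA MC1 setup: one correct choice per question,
    each choice scored by the model's log-probability.
    """
    best = max(range(len(log_probs)), key=lambda i: log_probs[i])
    return 1.0 if best == true_idx else 0.0


def mc2(log_probs, is_true):
    """Return the normalized probability mass placed on the true choices.

    Assumes the TruthfulQA MC2 setup: several choices may be true,
    and is_true[i] flags whether choice i is a true answer.
    """
    probs = [math.exp(lp) for lp in log_probs]
    true_mass = sum(p for p, t in zip(probs, is_true) if t)
    return true_mass / sum(probs)


# Hypothetical log-probabilities for three answer choices.
example = [-1.0, -2.0, -3.0]
print(mc1(example, true_idx=0))
print(mc2(example, is_true=[True, False, True]))
```

Averaging these per-question values over a benchmark and multiplying by 100 gives a score on a 0 to 100 scale, which is presumably the scale on which the point gains above are reported.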
Abstract
Hallucination correction is not a one-direction problem. We show that intermediate layers are neither uniformly more truthful than final layers nor uniformly less trustworthy. Yet hallucination reduction is usually instantiated through one fixed intervention form: contrast one layer against another, steer along a truthfulness direction, or defer to external evidence. This framing is structurally incomplete. Cross-layer factual evidence does not evolve uniformly: in some failures truthful support is present internally and later suppressed, whereas in others candidate competition remains genuinely multi-directional across depth, so no single signed scalar family is generally sufficient. We introduce Trajectory Correction from Cross-layer Evidence for Hallucination Reduction (TRACE), a deterministic, training-free algorithm which corrects hallucinations at inference time by deriving both…
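As a point of reference for the "fixed intervention form" the abstract criticizes, here is a minimal sketch of contrasting one layer against another with a single signed scalar. It is not TRACE itself, whose details are cut off above. It assumes per-layer logits are already available (for example, by projecting an intermediate hidden state through the output head); `contrast_scores` and `alpha` are hypothetical names.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrast_scores(final_logits, early_logits, alpha):
    """Score candidates by contrasting the final layer with an earlier one.

    alpha = 0 keeps the final-layer log-probabilities unchanged,
    alpha > 0 amplifies whatever the later layers added,
    alpha < 0 moves the scores back toward the earlier layer.
    This is a generic illustration, not the paper's algorithm.
    """
    lf = log_softmax(final_logits)
    le = log_softmax(early_logits)
    return [f + alpha * (f - e) for f, e in zip(lf, le)]


# Hypothetical logits for two candidate answers at two depths.
final_logits = [2.0, 1.0]
early_logits = [0.0, 1.5]
for alpha in (-1.0, 0.0, 1.0):
    print(alpha, contrast_scores(final_logits, early_logits, alpha))
```

Because `alpha` has one fixed sign for every input, this family can only push all cases in the same direction, which is the limitation the abstract points to when it says no single signed scalar family is generally sufficient.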
