HIVE: Hidden-Evidence Verification for Hallucination Detection in Diffusion Large Language Models
Guoshenghui Zhao, Tan Yu, and Weijie Zhao

TL;DR
HIVE is a framework that uses hidden evidence from the denoising trajectories of diffusion large language models to detect hallucinations more accurately.
Contribution
The paper presents a hidden-evidence verification method that outperforms eight baseline detectors by conditioning a verifier language model on evidence selected from the denoising trajectory, rather than relying on output uncertainty or coarse trace statistics alone.
Findings
HIVE achieves up to 0.9236 AUROC and 0.9537 AUPRC across two D-LLMs and three QA benchmarks.
Hidden evidence conditioning significantly improves hallucination detection.
Selected hidden evidence provides a stronger signal than output-only uncertainty.
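AUROC and AUPRC are both ranking metrics over a detector's continuous scores. As a minimal sketch of how such numbers are computed, the Python below evaluates a toy set of scores and labels. The data are illustrative only and not from the paper, and average precision is used here as one common estimator of AUPRC; the paper's exact estimator is not stated on this page.

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values measured at the rank of each positive example."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank examples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            ap += hits / rank
    return ap / total_pos


# Toy data (hypothetical): label 1 = hallucinated, 0 = faithful.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
toy_labels = [1, 0, 1, 1, 0]

toy_auroc = auroc(toy_scores, toy_labels)           # 4 of 6 positive-negative pairs ordered correctly
toy_ap = average_precision(toy_scores, toy_labels)  # (1/1 + 2/3 + 3/4) / 3
```

Both metrics depend only on the ordering of scores, which is why a detector that emits a continuous score can be compared against baselines without fixing a decision threshold.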
Abstract
Diffusion large language models (D-LLMs) generate text through multi-step denoising, where hallucination signals may emerge throughout the trajectory rather than only in the final output. Existing detectors mainly rely on output uncertainty or coarse trace statistics, which often fail to capture the richer hidden dynamics of D-LLMs. We propose HIVE, a hidden-evidence verification framework that extracts compressed hidden evidence from denoising trajectories, selects informative step-layer evidence, and conditions a verifier language model on the selected evidence through prefix embeddings. HIVE produces both a continuous hallucination score from verifier decision logits and structured verification outputs, including hallucination types, evidence pairs, and short rationales. Across two D-LLMs and three QA benchmarks, HIVE consistently outperforms eight strong baselines and achieves up to 0.9236…
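Two steps of the pipeline described in the abstract, selecting step-layer evidence and turning verifier decision logits into a continuous score, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the per-(step, layer) relevance scores, the top-k selection rule, and the two-way softmax over a "hallucinated" and a "faithful" logit are all hypothetical choices, since the abstract does not specify how evidence is scored or how the logits are combined.

```python
import math


def select_evidence(relevance, k):
    """Return the k (step, layer) pairs with the highest relevance.

    `relevance[t][l]` is a hypothetical informativeness score for denoising
    step t and layer l; the actual selection criterion is an assumption.
    """
    pairs = [
        (score, t, l)
        for t, row in enumerate(relevance)
        for l, score in enumerate(row)
    ]
    # Highest score first; break ties by earlier step, then lower layer.
    pairs.sort(key=lambda x: (-x[0], x[1], x[2]))
    return [(t, l) for _, t, l in pairs[:k]]


def hallucination_score(logit_hallucinated, logit_faithful):
    """Two-way softmax over the verifier's decision logits (assumed form)."""
    return 1.0 / (1.0 + math.exp(logit_faithful - logit_hallucinated))


# Toy relevance grid (hypothetical): 3 denoising steps x 2 layers.
toy_relevance = [
    [0.1, 0.7],
    [0.9, 0.3],
    [0.2, 0.8],
]
selected = select_evidence(toy_relevance, k=2)

# Equal logits give a score of 0.5; a larger "hallucinated" logit pushes it toward 1.
neutral = hallucination_score(0.0, 0.0)
confident = hallucination_score(2.0, 0.0)
```

In the framework as described, the hidden states at the selected (step, layer) positions would then be compressed and passed to the verifier as prefix embeddings; that part depends on model internals and is omitted here.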
