GRAD: Graph-Retrieved Adaptive Decoding for Hallucination Mitigation
Manh Nguyen, Sunil Gupta, Dai Do, Hung Le

TL;DR
GRAD is a decoding-time method that improves large language model factuality by grounding generation in corpus-derived evidence using a graph-based approach, without retraining.
Contribution
Introduces Graph-Retrieved Adaptive Decoding (GRAD), a novel, lightweight decoding technique that leverages corpus-derived token transition graphs to mitigate hallucinations in LLMs.
Findings
Achieves up to 9.7% higher intrinsic accuracy.
Reduces hallucination rates by 8.6%.
Improves correctness by 6.9% over greedy decoding.
Abstract
Hallucination mitigation remains a persistent challenge for large language models (LLMs), even as model scales grow. Existing approaches often rely on external knowledge sources, such as structured databases or knowledge graphs, accessed through prompting or retrieval. However, prompt-based grounding is fragile and domain-sensitive, while symbolic knowledge integration incurs heavy retrieval and formatting costs. Motivated by knowledge graphs, we introduce Graph-Retrieved Adaptive Decoding (GRAD), a decoding-time method that grounds generation in corpus-derived evidence without retraining. GRAD constructs a sparse token transition graph by accumulating next-token logits across a small retrieved corpus in a single forward pass. During decoding, graph-retrieved logits are max-normalized and adaptively fused with model logits to favor high-evidence continuations while preserving fluency.…
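The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `top_k` sparsification, and the fixed fusion weight `alpha` are all hypothetical stand-ins, and the abstract does not specify the adaptive fusion rule, so a constant weight is used here in its place. The toy logits stand in for what a single forward pass over the retrieved corpus would produce.

```python
from collections import defaultdict


def build_transition_graph(token_ids, logits_per_position, top_k=2):
    """Accumulate next-token logits into a sparse token transition graph.

    token_ids[i] is the token at position i of the retrieved corpus and
    logits_per_position[i] is the next-token logit vector at that position.
    Keeping only the top_k targets per position is an assumption made here
    to keep the graph sparse.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for tok, logits in zip(token_ids, logits_per_position):
        ranked = sorted(range(len(logits)), key=lambda j: logits[j], reverse=True)
        for j in ranked[:top_k]:
            graph[tok][j] += logits[j]
    # Convert to plain dicts so lookups never create new entries.
    return {src: dict(edges) for src, edges in graph.items()}


def max_normalize(edges):
    """Scale edge scores so the strongest edge equals 1.0."""
    peak = max(edges.values())
    if peak <= 0:
        return {j: 0.0 for j in edges}
    return {j: v / peak for j, v in edges.items()}


def fuse_logits(model_logits, graph, prev_token, alpha=1.0):
    """Add max-normalized graph evidence to the model logits.

    alpha is a fixed, hypothetical weight; the paper's adaptive weighting
    is not reproduced here.
    """
    edges = graph.get(prev_token)
    if not edges:
        return list(model_logits)  # no evidence: fall back to the model
    fused = list(model_logits)
    for j, score in max_normalize(edges).items():
        fused[j] += alpha * score
    return fused


# Toy corpus over a 4-token vocabulary (illustrative values only).
corpus_tokens = [0, 1, 0]
corpus_logits = [
    [0.1, 2.0, 0.5, -1.0],
    [0.0, 0.2, 3.0, 0.1],
    [0.0, 1.0, 2.0, -0.5],
]
graph = build_transition_graph(corpus_tokens, corpus_logits, top_k=2)

model_logits = [0.0, 1.0, 1.2, 0.0]
fused = fuse_logits(model_logits, graph, prev_token=0, alpha=2.0)
greedy_pick = max(range(len(model_logits)), key=lambda j: model_logits[j])
fused_pick = max(range(len(fused)), key=lambda j: fused[j])
```

In this toy setup the model alone prefers token 2 after token 0, while the corpus evidence for the transition 0 → 1 is strong enough to change the fused choice to token 1. Tokens with no outgoing edges in the graph leave the model logits unchanged.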
Taxonomy
Topics: Advanced Graph Neural Networks · Adversarial Robustness in Machine Learning · Topic Modeling
