Retrieval-augmented Decoding for Improving Truthfulness in Open-ended Generation
Manh Nguyen, Sunil Gupta, Hung Le

TL;DR
This paper introduces Retrieval-Augmented Decoding (RAD), a lightweight, context-aware method that improves the truthfulness of large language models during inference by leveraging a small, annotated reference space for retrieval-based logit shaping.
Contribution
RAD is a novel decoding strategy that uses a small annotated grounding space to enhance factual accuracy without retraining or extensive fine-tuning of LLMs.
Findings
RAD outperforms strong baselines across four benchmarks.
RAD demonstrates robust generalization across different tasks.
RAD requires only a few annotated examples (as few as 10) for effective grounding.
Abstract
Ensuring truthfulness in large language models (LLMs) remains a critical challenge for reliable text generation. While supervised fine-tuning and reinforcement learning with human feedback have shown promise, they require a substantial amount of annotated data and computational resources, limiting scalability. In contrast, decoding-time interventions offer lightweight alternatives without model retraining. However, existing decoding strategies often face issues like prompt sensitivity, limited generalization, or dependence on internal model states. We propose Retrieval-Augmented Decoding (RAD), a context-aware adaptive decoding method that leverages a compact reference grounding space built from as few as 10 annotated examples and comprising pairs of context embeddings and next-token logits from truthful responses, to enable retrieval-based logit shaping during inference. At each…
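The abstract is cut off before it states the exact shaping rule, so the following is only a minimal sketch of what retrieval-based logit shaping could look like, not the authors' implementation. It assumes the grounding space is a list of (context embedding, next-token logits) pairs, that retrieval uses cosine similarity, and that the retrieved logits are blended with the model's logits by linear interpolation. The function names and the parameters `top_k`, `temperature`, and `alpha` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def shape_logits(model_logits, query_embedding, grounding_space,
                 top_k=2, temperature=0.1, alpha=0.5):
    """Blend the model's next-token logits with logits retrieved from the grounding space.

    grounding_space: list of (context_embedding, next_token_logits) pairs
    collected from truthful responses. The blending rule is an assumption.
    """
    # Score every stored context against the current context and keep the top_k.
    scored = sorted(
        ((cosine(query_embedding, emb), logits) for emb, logits in grounding_space),
        key=lambda pair: pair[0],
        reverse=True,
    )[:top_k]
    if not scored:
        return list(model_logits)  # nothing to retrieve: leave logits unchanged

    # Softmax over similarities (shifted by the max for numerical stability).
    best = scored[0][0]
    weights = [math.exp((sim - best) / temperature) for sim, _ in scored]
    total = sum(weights)

    # Similarity-weighted average of the retrieved next-token logits.
    retrieved = [
        sum(w * logits[i] for w, (_, logits) in zip(weights, scored)) / total
        for i in range(len(model_logits))
    ]

    # Assumed shaping rule: linear interpolation between model and retrieved logits.
    return [(1 - alpha) * m + alpha * r for m, r in zip(model_logits, retrieved)]


# Toy example: two stored contexts over a three-token vocabulary.
space = [([1.0, 0.0], [5.0, 0.0, 0.0]), ([0.0, 1.0], [0.0, 5.0, 0.0])]
shaped = shape_logits([0.0, 0.0, 1.0], [1.0, 0.0], space, top_k=1, alpha=0.5)
```

In this toy setup the model alone prefers token 2, while the nearest stored context prefers token 0, so the blended logits move the decision toward token 0. Setting `alpha=0` recovers the unmodified model logits.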
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Text Readability and Simplification
