Improving Contextual Faithfulness of Large Language Models via Retrieval Heads-Induced Optimization

Lei Huang; Xiaocheng Feng; Weitao Ma; Yuchun Fan; Xiachong Feng; Yangfan Ye; Weihong Zhong; Yuxuan Gu; Baoxin Wang; Dayong Wu; Guoping Hu; Bing Qin

arXiv:2501.13573·cs.CL·January 24, 2025

TL;DR

This paper introduces RHIO, a novel training framework that enhances the contextual faithfulness of retrieval-augmented large language models by discriminating between faithful and unfaithful outputs, and also presents GroundBench, a new benchmark for evaluation.

Contribution

The paper proposes RHIO, a method that explicitly teaches LLMs to distinguish faithful from unfaithful generations using retrieval head masking and contrastive decoding, and introduces GroundBench for evaluation.
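The retrieval head masking mentioned above can be pictured with a small sketch. This is an illustration only, not the authors' implementation: it assumes that "masking" a head means zeroing that head's output before the heads are concatenated, and the function name `mask_heads`, the nested-list layout, and the choice of which heads to mask are all hypothetical.

```python
from typing import List, Sequence

# head_outputs[h][t] is the output vector of head h at token position t.
HeadOutputs = List[List[List[float]]]


def mask_heads(head_outputs: HeadOutputs, masked_heads: Sequence[int]) -> List[List[float]]:
    """Zero the outputs of the selected heads, then concatenate all heads per token.

    Assumption: masking a head is modeled as replacing its output with zeros.
    The paper summary above does not specify the exact masking mechanism.
    """
    masked = set(masked_heads)
    num_tokens = len(head_outputs[0])
    merged: List[List[float]] = []
    for t in range(num_tokens):
        row: List[float] = []
        for h, head in enumerate(head_outputs):
            vec = head[t]
            # A masked head contributes zeros; other heads pass through unchanged.
            row.extend([0.0] * len(vec) if h in masked else vec)
        merged.append(row)
    return merged


# Two heads, one token, head dimension 2; head 1 stands in for a retrieval head.
heads = [[[1.0, 2.0]], [[3.0, 4.0]]]
print(mask_heads(heads, masked_heads=[1]))
```

Generating with such heads switched off is one way a model could be made to produce the less context-grounded outputs that the summary describes as unfaithful samples.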

Findings

01. RHIO significantly improves faithfulness in LLMs.

02. RHIO outperforms GPT-4o on GroundBench.

03. GroundBench provides a comprehensive evaluation benchmark.

Abstract

Ensuring contextual faithfulness in retrieval-augmented large language models (LLMs) is crucial for building trustworthy information-seeking systems, particularly in long-form question-answering (LFQA) scenarios. In this work, we identify a salient correlation between LFQA faithfulness and retrieval heads, a set of attention heads responsible for retrieving contextual information. Leveraging this insight, we propose RHIO, a framework designed to teach LLMs to explicitly discriminate between faithful and unfaithful generations. RHIO first augments unfaithful samples that simulate realistic model-intrinsic errors by selectively masking retrieval heads. Then, these samples are incorporated into joint training, enabling the model to distinguish unfaithful outputs from faithful ones conditioned on control tokens. Furthermore, these control tokens are leveraged to self-induce contrastive…
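The abstract is cut off before it states how the control tokens drive contrastive decoding, so the following is only a minimal sketch of a commonly used contrastive form, not necessarily the formulation RHIO uses. It assumes the same model is run twice per step, once conditioned on a faithful control token and once on an unfaithful one, and that next-token scores are combined as (1 + alpha) * log p_faithful - alpha * log p_unfaithful. The function names, the `alpha` parameter, and the toy logits are hypothetical.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def contrastive_scores(faithful_logits: List[float],
                       unfaithful_logits: List[float],
                       alpha: float = 1.0) -> List[float]:
    """Amplify tokens preferred under the faithful condition and penalize
    tokens preferred under the unfaithful condition (assumed formula)."""
    lp_f = log_softmax(faithful_logits)
    lp_u = log_softmax(unfaithful_logits)
    return [(1 + alpha) * f - alpha * u for f, u in zip(lp_f, lp_u)]


def greedy_pick(scores: List[float]) -> int:
    """Index of the highest-scoring token."""
    return max(range(len(scores)), key=scores.__getitem__)


# Toy three-token vocabulary: token 0 is likely under both conditions,
# token 1 is likely only under the faithful condition.
faithful = [2.0, 1.9, 0.0]
unfaithful = [2.0, 0.5, 0.0]
print(greedy_pick(contrastive_scores(faithful, unfaithful, alpha=0.0)))
print(greedy_pick(contrastive_scores(faithful, unfaithful, alpha=1.0)))
```

With `alpha=0.0` the scores reduce to the faithful-conditioned log-probabilities; raising `alpha` shifts the choice toward tokens that the two conditions disagree on.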

Taxonomy

Topics: Topic Modeling

Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training