Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation

Yuhao Wang; Ruiyang Ren; Yucheng Wang; Wayne Xin Zhao; Jing Liu; Hua Wu; Haifeng Wang

arXiv:2505.11995 · cs.CL · May 20, 2025

TL;DR

This paper systematically investigates how large language models (LLMs) utilize internal and external knowledge in retrieval-augmented generation, revealing four key stages and introducing a new neuron identification method to enhance interpretability.

Contribution

It introduces a knowledge stream analysis framework, decomposes the knowledge utilization process into four stages, and proposes KAPE for neuron identification, advancing understanding of LLM knowledge integration.

Findings

01. Knowledge streaming occurs in four stages: refinement, elicitation, expression, and contestation.

02. Passage relevance influences the knowledge streaming process.

03. Deactivation of specific neurons shifts reliance between internal and external knowledge.
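The neuron-deactivation finding can be illustrated with a toy sketch. This is not the paper's KAPE method or its experimental setup: the two-neuron feed-forward block, the hand-picked weights, and the names `mlp_forward`, `answer_probs`, `W_IN`, and `W_OUT` are all assumptions made for illustration. It only shows the general idea of zeroing a hidden neuron's activation and observing which answer the output then favors.

```python
import math

# Hypothetical toy weights (assumption, not from the paper).
# Hidden neuron 0 responds to the "internal knowledge" input feature,
# hidden neuron 1 responds to the "retrieved passage" input feature.
W_IN = [[1.0, 0.0],
        [0.0, 1.5]]
# Output logit 0 = answer from parametric memory, logit 1 = answer from the passage.
W_OUT = [[2.0, 0.0],
         [0.0, 2.0]]


def mlp_forward(x, w_in, w_out, deactivated=frozenset()):
    """Compute w_out @ relu(w_in @ x), zeroing the listed hidden neurons."""
    hidden = []
    for i, row in enumerate(w_in):
        act = max(0.0, sum(w * v for w, v in zip(row, x)))
        # Deactivation: force the neuron's activation to zero.
        hidden.append(0.0 if i in deactivated else act)
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def answer_probs(deactivated=frozenset()):
    """Return [P(internal answer), P(passage answer)] for a fixed toy input."""
    x = [1.0, 1.0]  # both knowledge sources are present in the input
    return softmax(mlp_forward(x, W_IN, W_OUT, deactivated))


base = answer_probs()                    # no neurons deactivated
ablated = answer_probs(deactivated={1})  # deactivate the passage-sensitive neuron

print("baseline:", base)
print("neuron 1 deactivated:", ablated)
```

In this toy setting the baseline favors the passage-based answer, and deactivating the passage-sensitive neuron moves the preference to the internal answer. In a real LLM the relevant neurons would first have to be identified, which is the role the paper assigns to KAPE.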

Abstract

Considering the inherent limitations of parametric knowledge in large language models (LLMs), retrieval-augmented generation (RAG) is widely employed to expand their knowledge scope. Since RAG has shown promise in knowledge-intensive tasks like open-domain question answering, its broader application to complex tasks and intelligent assistants has further advanced its utility. Despite this progress, the underlying knowledge utilization mechanisms of LLM-based RAG remain underexplored. In this paper, we present a systematic investigation of the intrinsic mechanisms by which LLMs integrate internal (parametric) and external (retrieved) knowledge in RAG scenarios. Specifically, we employ knowledge stream analysis at the macroscopic level, and investigate the function of individual modules at the microscopic level. Drawing on knowledge streaming analyses, we decompose the knowledge utilization…

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Speech and dialogue systems

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Layer Normalization · Softmax · Attention Dropout · WordPiece · Residual Connection · Linear Layer · Byte Pair Encoding