LFD: Layer Fused Decoding to Exploit External Knowledge in Retrieval-Augmented Generation

Yang Sun; Zhiyong Xie; Lixin Zou; Dan Luo; Min Tang; Xiangyu Zhao; Yunwei Zhao; Xixun Lin; Yanxiong Lu; Chenliang Li

arXiv:2508.19614·cs.CL·January 8, 2026


TL;DR

This paper introduces Layer Fused Decoding (LFD), a decoding method that combines intermediate-layer representations with the final-layer output so that retrieval-augmented generation makes fuller use of external knowledge.

Contribution

The paper presents a layer-specific analysis of how LLMs use external knowledge, proposes LFD to improve the integration of that knowledge, and develops an internal knowledge score (IKS) to select optimal layers.

Findings

01

LFD improves generation quality with minimal additional cost.

02

Layer analysis reveals distinct roles for shallow, intermediate, and deep layers.

03

Optimal layer selection via IKS enhances external knowledge utilization.
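
To make the idea of combining an intermediate layer with the final output concrete, here is a minimal sketch. This page does not state the paper's actual fusion rule or how IKS is computed, so everything here is an assumption for illustration: the function names `log_softmax` and `fuse_and_pick`, the weight `alpha`, and the choice of a weighted sum of log-probabilities are all hypothetical. The sketch also assumes the intermediate-layer logits have already been obtained somehow (for example, by applying the model's output head to an intermediate hidden state) and that the layer has already been chosen.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over a list of vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def fuse_and_pick(
    intermediate_logits: List[float],
    final_logits: List[float],
    alpha: float = 0.5,
) -> int:
    """Return the token id with the highest fused score.

    ASSUMPTION: the fusion is a weighted sum of log-probabilities with a
    hypothetical weight `alpha`; this is not taken from the paper.
    alpha = 0 reproduces ordinary greedy decoding from the final layer,
    alpha = 1 decodes from the intermediate layer alone.
    """
    if len(intermediate_logits) != len(final_logits):
        raise ValueError("both layers must score the same vocabulary")
    inter = log_softmax(intermediate_logits)
    final = log_softmax(final_logits)
    fused = [alpha * a + (1.0 - alpha) * b for a, b in zip(inter, final)]
    # Greedy choice: index of the largest fused score.
    return max(range(len(fused)), key=fused.__getitem__)


# Toy three-token vocabulary: the final layer prefers token 0,
# the intermediate layer prefers token 1.
final_layer = [2.0, 1.0, 0.0]
intermediate_layer = [0.0, 3.0, 0.0]
next_token = fuse_and_pick(intermediate_layer, final_layer, alpha=0.5)
```

In this toy setup the weight controls how far the intermediate layer can pull the decision away from the final layer's preference; any real implementation would need the paper's own fusion rule and layer-selection criterion.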

Abstract

Retrieval-augmented generation (RAG) incorporates external knowledge into large language models (LLMs), improving their adaptability to downstream tasks and enabling information updates. Surprisingly, recent empirical evidence demonstrates that injecting noise into retrieved relevant documents facilitates exploitation of external knowledge and improves generation quality. Although counterintuitive and challenging to apply in practice, this phenomenon enables granular control and rigorous analysis of how LLMs integrate external knowledge. Therefore, in this paper, we intervene on noise injection and establish a layer-specific functional demarcation within the LLM: shallow layers specialize in local context modeling, intermediate layers focus on integrating long-range external factual knowledge, and deeper layers primarily rely on parametric internal knowledge. Building on…
