Parametric Retrieval Augmented Generation
Weihang Su, Yichen Tang, Qingyao Ai, Junxi Yan, Changyue Wang, Hongning Wang, Ziyi Ye, Yujia Zhou, Yiqun Liu

TL;DR
Parametric RAG embeds external knowledge directly into an LLM's parameters rather than appending it to the input. This improves efficiency and effectiveness over traditional in-context retrieval methods, and the two approaches can be combined for even better performance.
Contribution
The paper proposes Parametric RAG, a new paradigm that integrates external knowledge into LLMs' parameters, overcoming limitations of input-level knowledge injection.
Findings
Significantly improves knowledge augmentation effectiveness.
Reduces online computational costs by eliminating the need to inject retrieved documents into the LLM's input context during inference.
Can be combined with in-context RAG for enhanced performance.
Abstract
Retrieval-augmented generation (RAG) techniques have emerged as a promising solution to enhance the reliability of large language models (LLMs) by addressing issues like hallucinations, outdated knowledge, and domain adaptation. In particular, existing RAG methods append relevant documents retrieved from external corpus or databases to the input of LLMs to guide their generation process, which we refer to as the in-context knowledge injection method. While this approach is simple and often effective, it has inherent limitations. Firstly, increasing the context length and number of relevant documents can lead to higher computational overhead and degraded performance, especially in complex reasoning tasks. More importantly, in-context knowledge injection operates primarily at the input level, but LLMs store their internal knowledge in their parameters. This gap fundamentally limits the…
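The contrast the abstract draws between input-level and parameter-level knowledge injection can be illustrated with a toy sketch. This is not the paper's implementation: the abstract excerpt above does not say how documents are encoded into parameters, so the low-rank update form used here (a delta `B @ A` added to a weight matrix) is an assumption chosen purely for illustration, and every function and variable name is hypothetical.

```python
# Toy contrast between in-context and parametric knowledge injection.
# ASSUMPTION: each document is represented by a low-rank pair (B, A) whose
# product is added to a model weight matrix. The source text does not
# specify this; it is used here only to make the idea concrete.

def build_in_context_prompt(question, documents):
    """In-context injection: retrieved documents are prepended to the input."""
    context = "\n\n".join(documents)
    return f"{context}\n\nQuestion: {question}\nAnswer:"

def build_plain_prompt(question):
    """Parametric injection leaves the input as the bare question."""
    return f"Question: {question}\nAnswer:"

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def merge_low_rank_updates(weight, updates):
    """Return a copy of `weight` with every document delta B @ A added."""
    merged = [row[:] for row in weight]
    for b, a in updates:
        delta = matmul(b, a)
        for i in range(len(merged)):
            for j in range(len(merged[0])):
                merged[i][j] += delta[i][j]
    return merged

# Placeholder question and documents (hypothetical content).
question = "What does document two say?"
documents = ["Document one text.", "Document two text.", "Document three text."]

# Input-level injection: the prompt grows with every retrieved document.
prompt_in_context = build_in_context_prompt(question, documents)

# Parameter-level injection: the prompt stays short; the weights change instead.
prompt_parametric = build_plain_prompt(question)
base_weight = [[1.0, 0.0], [0.0, 1.0]]
document_updates = [
    ([[1.0], [2.0]], [[0.5, 0.0]]),  # hypothetical (B, A) pair for document 1
    ([[0.0], [1.0]], [[0.0, 2.0]]),  # hypothetical (B, A) pair for document 2
]
merged_weight = merge_low_rank_updates(base_weight, document_updates)

print(len(prompt_in_context), len(prompt_parametric))
print(merged_weight)
```

In this sketch the in-context prompt lengthens with each added document, whereas the parametric path keeps the input fixed and moves the per-document cost into a weight update.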
Taxonomy
Topics: Neural Networks and Applications
Methods: Attention Is All You Need · Layer Normalization · Dense Connections · Softmax · Linear Warmup With Linear Decay · Adam · Residual Connection · Dropout · Byte Pair Encoding
