Post-training an LLM for RAG? Train on Self-Generated Demonstrations

Matthew Finlayson; Ilia Kulikov; Daniel M. Bikel; Barlas Oguz; Xilun Chen; Aasish Pappu

arXiv:2502.10596·cs.CL·March 4, 2025

TL;DR

This paper introduces a self-generated demonstration training method for RAG-enabled LLMs, improving knowledge-based question answering and avoiding issues like hallucinations caused by out-of-distribution training data.

Contribution

The paper proposes a novel training approach using self-generated demonstrations to enhance RAG performance and prevent model degradation.

Findings

01 Improves LLM handling of retrievals in QA tasks

02 Prevents model hallucinations and degradation

03 Outperforms conventional RAG fine-tuning methods

Abstract

Large language models (LLMs) often struggle with knowledge-intensive NLP tasks, such as answering "Who won the latest World Cup?", because the knowledge they learn during training may be insufficient or outdated. Conditioning generation on retrieved documents -- a technique known as retrieval-augmented generation (RAG) -- mitigates these shortcomings by allowing the model to leverage in-context information. Practitioners can improve LLM RAG performance by fine-tuning on retrieval-augmented instructions, but must beware that this can cause undesirable model behaviors like hallucinations. We attribute this degradation to the fact that the training data is likely to be out-of-distribution for the model and may suffer from quality issues, such as misalignment between retrievals and target responses (since retrievals are frequently added post-hoc). We propose a recipe for training RAG-enabled…

Taxonomy

Topics: Artificial Intelligence in Law · Natural Language Processing Techniques

Methods: Attention Is All You Need · Byte Pair Encoding · Adam · Softmax · Dropout · Weight Decay · BART · WordPiece · Layer Normalization