Finetune-RAG: Fine-Tuning Language Models to Resist Hallucination in Retrieval-Augmented Generation
Zhan Peng Lee, Andre Lin, Calvin Tan

TL;DR
Finetune-RAG introduces a fine-tuning method with a new dataset to enhance the factual accuracy of retrieval-augmented language models, effectively reducing hallucinations caused by retrieval errors.
Contribution
The paper presents Finetune-RAG, a fine-tuning approach paired with a training dataset constructed to mimic real-world retrieval imperfections, and Bench-RAG, an LLM-as-a-judge evaluation pipeline for RAG models under imperfect retrieval.
Findings
Finetune-RAG improves factual accuracy by 21.2% over the base model.
The new dataset mimics real-world retrieval imperfections.
Bench-RAG, an LLM-as-a-judge evaluation pipeline, stress-tests models under realistic imperfect retrieval scenarios.
Abstract
Retrieval-Augmented Generation (RAG) has emerged as a powerful framework to improve factuality in large language models (LLMs) by grounding their outputs in retrieved documents. However, ensuring perfect retrieval of relevant information remains challenging, and when irrelevant content is passed downstream to an LLM, it can lead to hallucinations. In this work, we propose Finetune-RAG, a simple and effective fine-tuning approach that features the first-of-its-kind RAG training dataset constructed to mimic real-world imperfections. Experimental results show that Finetune-RAG improves factual accuracy by 21.2% over the base model. We also propose Bench-RAG, an LLM-as-a-judge evaluation pipeline that stress tests models under realistic imperfect retrieval scenarios. Our codebase and dataset are fully open sourced for community use.
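The abstract describes training data built to mimic imperfect retrieval but does not spell out the record format. As a rough illustration only, the sketch below assumes each training example pairs a question with one relevant passage and one irrelevant passage, with a reference answer grounded in the relevant one. The `RagExample` class, the `build_example` helper, the prompt wording, and the sample passages are all hypothetical and are not taken from the paper's codebase.

```python
import random
from dataclasses import dataclass


@dataclass
class RagExample:
    prompt: str  # question plus retrieved passages, as shown to the model
    target: str  # reference answer grounded only in the relevant passage


def build_example(question, relevant, distractor, answer, rng):
    """Mix one relevant and one irrelevant passage into a single training prompt.

    Hypothetical helper: the record layout and prompt text are assumptions.
    """
    passages = [relevant, distractor]
    # Shuffle so the model cannot learn that a fixed position is always correct.
    rng.shuffle(passages)
    context = "\n\n".join(
        f"[Document {i}]\n{p}" for i, p in enumerate(passages, start=1)
    )
    prompt = (
        "Answer the question using only the documents that are relevant to it.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return RagExample(prompt=prompt, target=answer)


# Made-up passages, for illustration only.
example = build_example(
    question="When was the Ashford footbridge completed?",
    relevant="The Ashford footbridge was completed in 1987.",
    distractor="The Ashford ferry terminal reopened in 2004.",
    answer="The Ashford footbridge was completed in 1987.",
    rng=random.Random(0),
)
print(example.prompt)
```

Fine-tuning on pairs like `(example.prompt, example.target)` would, under these assumptions, reward the model for answering from the relevant passage while ignoring the distractor.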
Taxonomy
Topics: Topic Modeling · Misinformation and Its Impacts · Mental Health via Writing
Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Layer Normalization · Softmax · Attention Dropout · WordPiece · Residual Connection · Linear Layer · Byte Pair Encoding
