Investigating the performance of Retrieval-Augmented Generation and fine-tuning for the development of AI-driven knowledge-based systems

Robert Lakatos; Peter Pollner; Andras Hajdu; Tamas Joo

arXiv:2403.09727·cs.CL·May 13, 2025·3 cites

TL;DR

This paper compares Retrieval-Augmented Generation (RAG) and fine-tuning (FN) techniques for large language models, demonstrating RAG's superior efficiency and performance in knowledge-based systems, especially in reducing hallucinations.

Contribution

The study provides a comprehensive comparison of RAG and FN for multiple LLMs, highlighting RAG's advantages and proposing a simple architecture that outperforms FN in key metrics.
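As a rough illustration of what a simple RAG pipeline involves, the sketch below ranks candidate passages against a question and builds a prompt for a generative model from the top matches. It is not the architecture proposed in the paper: the bag-of-words retriever, the prompt template, and the names `retrieve` and `build_prompt` are all assumptions made for this example, and a real system would more likely use dense embeddings and pass the prompt to an LLM.

```python
from collections import Counter
import math


def bow(text):
    """Lowercased whitespace tokens as a bag-of-words vector (assumed tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question (hypothetical retriever)."""
    q = bow(question)
    ranked = sorted(passages, key=lambda p: cosine(q, bow(p)), reverse=True)
    return ranked[:k]


def build_prompt(question, passages, k=2):
    """Prepend retrieved context to the question (hypothetical prompt template)."""
    context = "\n".join(retrieve(question, passages, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    docs = [
        "RAG retrieves documents before generation",
        "Fine-tuning updates model weights",
        "Bananas are yellow",
    ]
    print(build_prompt("how does fine-tuning change weights", docs, k=1))
```

The point of the sketch is that the model's weights stay untouched: domain knowledge enters only through the retrieved context, whereas fine-tuning would change the weights themselves.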

Findings

1. RAG outperforms FN by 16% in ROUGE score
2. RAG achieves 15% higher BLEU scores than FN
3. RAG reduces hallucinations significantly compared to FN

Abstract

The development of generative large language models (G-LLM) opened up new opportunities for the development of new types of knowledge-based systems similar to ChatGPT, Bing, or Gemini. Fine-tuning (FN) and Retrieval-Augmented Generation (RAG) are techniques that can be used to implement domain adaptation for the development of G-LLM-based knowledge systems. In our study, using ROUGE, BLEU, and METEOR scores and cosine similarity, we compare and examine the performance of RAG and FN for the GPT-J-6B, OPT-6.7B, LLaMA, and LLaMA-2 language models. Based on measurements on different datasets, we demonstrate that RAG-based constructions are more efficient than models produced with FN. We point out that connecting RAG and FN is not trivial, because connecting FN models with RAG can cause a decrease in performance. Furthermore, we outline a simple RAG-based architecture which, on average,…
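To make the evaluation metrics named in the abstract more concrete, here is a minimal sketch of two of them: a ROUGE-1 F1 score (unigram overlap) and a cosine similarity. The whitespace tokenizer and the bag-of-words vectors are simplifying assumptions for this example; the paper's actual evaluation setup is not described here, and cosine similarity in such studies is often computed over sentence embeddings rather than word counts.

```python
from collections import Counter
import math


def tokenize(text):
    """Lowercase and split on whitespace (assumed tokenizer for illustration)."""
    return text.lower().split()


def rouge1_f1(candidate, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def cosine_similarity(a, b):
    """Cosine similarity of bag-of-words count vectors (assumption: no embeddings)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    generated = "the cat"
    reference = "the cat sat"
    print(rouge1_f1(generated, reference))
    print(cosine_similarity(generated, reference))
```

Overlap metrics such as ROUGE and BLEU reward matching surface tokens, while cosine similarity over embeddings can credit paraphrases, which is one reason to report them together.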

Code & Models

Repositories

robertlakatos/ragvsfn (official)

Taxonomy

Topics: Semantic Web and Ontologies · AI-based Problem Solving and Planning

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Multi-Head Attention · Linear Warmup With Linear Decay · Dropout · Byte Pair Encoding · Dense Connections · Adam