Fine-Tuning or Fine-Failing? Debunking Performance Myths in Large Language Models
Scott Barnett, Zac Brannelly, Stefanus Kurniawan, Sheng Wong

TL;DR
This paper investigates the effects of fine-tuning large language models within Retrieval-Augmented Generation systems, revealing that fine-tuning can sometimes decrease performance contrary to common expectations.
Contribution
It provides empirical evidence that fine-tuning LLMs in RAG pipelines may impair their ability to extract and utilize contextual information, challenging assumptions about fine-tuning benefits.
Findings
Fine-tuning reduced accuracy in RAG tasks across multiple domains.
Unlike in standalone LLMs, where fine-tuning typically helps, fine-tuning can degrade performance in retrieval-based systems.
The results highlight the importance of validating fine-tuning effects for domain-specific applications.
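One way to act on the last finding is to score a base model and its fine-tuned counterpart on the same retrieved contexts before deploying either. The sketch below is illustrative only and is not the paper's evaluation method: `rag_accuracy`, `base_model`, `finetuned_model`, the substring-match scoring rule, and the toy questions are all assumptions introduced here. The two model functions are stand-ins that return canned behavior, not real LLM calls.

```python
from typing import Callable, Sequence


def rag_accuracy(
    answer_fn: Callable[[str, str], str],
    questions: Sequence[str],
    contexts: Sequence[str],
    references: Sequence[str],
) -> float:
    """Fraction of questions whose reference answer appears in the model output.

    Substring matching is a deliberately crude scoring rule (an assumption);
    a real evaluation would likely use a stricter or human-judged metric.
    """
    if not questions:
        raise ValueError("need at least one question")
    correct = 0
    for question, context, reference in zip(questions, contexts, references):
        prediction = answer_fn(question, context)
        if reference.strip().lower() in prediction.strip().lower():
            correct += 1
    return correct / len(questions)


def base_model(question: str, context: str) -> str:
    # Hypothetical stand-in: answers by repeating the retrieved context.
    return context


def finetuned_model(question: str, context: str) -> str:
    # Hypothetical stand-in: ignores the context and answers from a fixed memory,
    # mimicking a model that stopped using retrieved information.
    memory = {"What is the capital of France?": "Paris"}
    return memory.get(question, "I don't know")


questions = ["What is the capital of France?", "What is the warranty period?"]
contexts = ["Paris is the capital of France.", "The warranty period is 2 years."]
references = ["Paris", "2 years"]

base_score = rag_accuracy(base_model, questions, contexts, references)
tuned_score = rag_accuracy(finetuned_model, questions, contexts, references)
print(f"base: {base_score:.2f}  fine-tuned: {tuned_score:.2f}")
```

Holding the questions and retrieved contexts fixed isolates the effect of the model swap, so any drop in the score can be attributed to the fine-tuned model rather than to retrieval.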
Abstract
Large Language Models (LLMs) have the unique capability to understand and generate human-like text from input queries. When fine-tuned, these models show enhanced performance on domain-specific queries. OpenAI highlights the process of fine-tuning, stating: "To fine-tune a model, you are required to provide at least 10 examples. We typically see clear improvements from fine-tuning on 50 to 100 training examples, but the right number varies greatly based on the exact use case." This study extends this concept to the integration of LLMs within Retrieval-Augmented Generation (RAG) pipelines, which aim to improve accuracy and relevance by leveraging external corpus data for information retrieval. However, RAG's promise of delivering optimal responses often falls short in complex query scenarios. This study aims to specifically examine the effects of fine-tuning LLMs on their ability to…
Taxonomy
Topics: Topic Modeling
Methods: Attention Is All You Need · WordPiece · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay
