Assessment of RAG and Fine-Tuning for Industrial Question-Answering-Applications

Jakob Sturm; Josef Pichlmeier; Christian Bernhard; Maka Karalashvili; Johannes Klepsch; Georg Groh; Andre Luckow

arXiv:2605.09533·cs.CL·May 12, 2026


TL;DR

This paper compares Retrieval-Augmented Generation (RAG) and fine-tuning for industrial question answering. It finds that RAG is the more cost-efficient adaptation method and that it lets open-source models reach quality comparable to premium models in enterprise settings.

Contribution

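The paper extends the Cost-of-Pass framework of Erol et al. (arXiv:2504.13359) to jointly assess output quality, generation cost, and user interaction cost, and applies it to the cost-accuracy trade-offs of RAG and fine-tuning in industrial QA applications.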

Findings

01. Premium models perform best out of the box.

02. Open-source models reach comparable quality when enhanced with RAG.

03. RAG is the most cost-efficient adaptation method.

Abstract

Large Language Models (LLMs) are increasingly employed in enterprise question-answering (QA) systems, requiring adaptation to domain-specific knowledge. Among the most prevalent methods for incorporating such knowledge are Retrieval-Augmented Generation (RAG) and fine-tuning (FT). Yet, from a cost-accuracy trade-off perspective, it remains unclear which approach best suits industry scenarios. This study examines the impact of RAG and FT on two closed datasets specific to the automotive industry, assessing answer quality and operational costs. We extend the Cost-of-Pass framework proposed by Erol et al. (arXiv:2504.13359) to jointly assess output quality, generation cost, and user interaction cost. Our findings reveal that while premium models perform best out of the box, open-source models can achieve comparable quality when enhanced with RAG. Overall, RAG emerges as the most effective…
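To make the cost-accuracy idea concrete, the sketch below computes a cost-of-pass style metric: the expected cost of obtaining one correct answer, taken as the cost per attempt divided by the success rate. The base ratio follows the general idea of the Cost-of-Pass framework. The function `extended_cost_of_pass`, which simply adds a per-attempt user interaction cost to the generation cost, is an assumption made for illustration; the exact formulation the authors use is not given in this excerpt. All numbers are placeholders, not results from the paper.

```python
def cost_of_pass(cost_per_attempt: float, success_rate: float) -> float:
    """Expected cost of obtaining one correct answer.

    If each attempt costs `cost_per_attempt` and succeeds with probability
    `success_rate`, the expected number of attempts is 1 / success_rate.
    """
    if cost_per_attempt < 0:
        raise ValueError("cost_per_attempt must be non-negative")
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must lie in [0, 1]")
    if success_rate == 0.0:
        # A system that never answers correctly has unbounded cost-of-pass.
        return float("inf")
    return cost_per_attempt / success_rate


def extended_cost_of_pass(
    generation_cost: float,
    interaction_cost: float,
    success_rate: float,
) -> float:
    """Hypothetical extension: add a per-attempt user interaction cost.

    ASSUMPTION: the two costs are summed per attempt. The paper's actual
    extension may combine them differently.
    """
    return cost_of_pass(generation_cost + interaction_cost, success_rate)


if __name__ == "__main__":
    # Placeholder values in arbitrary currency units, NOT taken from the paper.
    system_a = extended_cost_of_pass(
        generation_cost=0.02, interaction_cost=0.10, success_rate=0.80
    )
    system_b = extended_cost_of_pass(
        generation_cost=0.004, interaction_cost=0.10, success_rate=0.78
    )
    print(f"system A: {system_a:.4f} per correct answer")
    print(f"system B: {system_b:.4f} per correct answer")
```

Under this simplified reading, a cheaper system with slightly lower accuracy can still come out ahead, because the per-attempt saving outweighs the extra attempts needed. That is the kind of trade-off a joint cost-accuracy metric is meant to expose.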
