Optimizing and Evaluating Enterprise Retrieval-Augmented Generation (RAG): A Content Design Perspective
Sarah Packowski, Inge Halilovic, Jenifer Schlotfeldt, Trish Smith

TL;DR
This paper shares practical insights from building enterprise-scale RAG systems, emphasizing modular, model-agnostic strategies and highlighting the importance of content design and human-in-the-loop evaluation for effective question-answering solutions.
Contribution
It provides real-world experience and strategies for optimizing enterprise RAG systems, focusing on content design, modularity, and flexible evaluation methods beyond standard benchmarks.
Findings
Simple changes in knowledge base content greatly improve RAG performance
Model-agnostic and modular approaches enhance system flexibility
Human-in-the-loop evaluation is essential for assessing novel user questions
Abstract
Retrieval-augmented generation (RAG) is a popular technique for using large language models (LLMs) to build customer-support, question-answering solutions. In this paper, we share our team's practical experience building and maintaining enterprise-scale RAG solutions that answer users' questions about our software based on product documentation. Our experience has not always matched the most common patterns in the RAG literature. This paper focuses on solution strategies that are modular and model-agnostic. For example, our experience over the past few years, using different search methods, different LLMs, and many knowledge base collections, has been that simple changes to the way we create knowledge base content can have a huge impact on our RAG solutions' success. In this paper, we also discuss how we monitor and evaluate results. Common RAG benchmark evaluation techniques have not been…
Taxonomy
Topics: Semantic Web and Ontologies · Recommender Systems and Techniques · Open Education and E-Learning
Methods: Linear Layer · Byte Pair Encoding · Softmax · Multi-Head Attention · WordPiece · Dropout · Layer Normalization · Adam · Attention Dropout
