AutoRAG: Automated Framework for optimization of Retrieval Augmented Generation Pipeline
Dongkyu Kim, Byoungwook Kim, Donggeon Han, Matouš Eibich

TL;DR
AutoRAG is an automated framework that selects and combines RAG modules suited to a given dataset, with the aim of improving retrieval-augmented generation performance on that dataset.
Contribution
The paper introduces AutoRAG, an automated system that searches for a well-performing combination of RAG modules for a given dataset, addressing the problem that module performance varies from one dataset to another.
Findings
AutoRAG explores candidate RAG modules and approximates the optimal combination for a given dataset.
The authors report the results of optimizing a dataset with AutoRAG.
All data and results are publicly available on GitHub.
Abstract
Using LLMs (Large Language Models) in conjunction with external documents has made RAG (Retrieval-Augmented Generation) an essential technology. Numerous techniques and modules for RAG are being researched, but their performance can vary across different datasets. Finding RAG modules that perform well on specific datasets is challenging. In this paper, we propose the AutoRAG framework, which automatically identifies suitable RAG modules for a given dataset. AutoRAG explores and approximates the optimal combination of RAG modules for the dataset. Additionally, we share the results of optimizing a dataset using AutoRAG. All experimental results and data are publicly available and can be accessed through our GitHub repository https://github.com/Marker-Inc-Korea/AutoRAG_ARAGOG_Paper .
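To make the idea of searching for a module combination concrete, here is a minimal sketch in Python. It is not the AutoRAG implementation and does not use the AutoRAG library: the abstract does not describe the search strategy, so the stage-by-stage greedy search below is an assumption, and every name in it (`CORPUS`, `EVAL_SET`, `STAGES`, `greedy_search`, the toy retrievers and rerankers, and the top-1 accuracy metric) is hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy corpus and evaluation set (not taken from the paper).
CORPUS: Dict[str, str] = {
    "d1": "AutoRAG searches for suitable RAG modules",
    "d2": "Bananas are rich in potassium",
    "d3": "RAG combines retrieval with LLM generation",
}
EVAL_SET: List[Tuple[str, str]] = [
    ("which framework searches RAG modules", "d1"),
    ("what combines retrieval with generation", "d3"),
]

Ranking = List[str]
Module = Callable[[str, Ranking], Ranking]


def keyword_retriever(query: str, ranking: Ranking) -> Ranking:
    """Rank documents by the number of lowercase tokens shared with the query."""
    q = set(query.lower().split())
    return sorted(ranking, key=lambda d: -len(q & set(CORPUS[d].lower().split())))


def reverse_id_retriever(query: str, ranking: Ranking) -> Ranking:
    """Deliberately weak baseline: ignore the query, sort ids in descending order."""
    return sorted(ranking, reverse=True)


def passthrough(query: str, ranking: Ranking) -> Ranking:
    """Leave the ranking unchanged."""
    return list(ranking)


def shortest_first_reranker(query: str, ranking: Ranking) -> Ranking:
    """Deliberately poor reranker: prefer the shortest document."""
    return sorted(ranking, key=lambda d: len(CORPUS[d]))


# Candidate modules per pipeline stage, in execution order.
STAGES: Dict[str, Dict[str, Module]] = {
    "retrieval": {"keyword": keyword_retriever, "reverse_id": reverse_id_retriever},
    "rerank": {"passthrough": passthrough, "shortest_first": shortest_first_reranker},
}


def top1_accuracy(pipeline: List[Module]) -> float:
    """Fraction of evaluation queries whose top-ranked document is the gold one."""
    hits = 0
    for query, gold in EVAL_SET:
        ranking = sorted(CORPUS)
        for module in pipeline:
            ranking = module(query, ranking)
        hits += ranking[0] == gold
    return hits / len(EVAL_SET)


def greedy_search(stages: Dict[str, Dict[str, Module]]) -> Tuple[Dict[str, str], float]:
    """Assumed strategy: fix the best module for each stage before moving on."""
    chosen: List[Module] = []
    names: Dict[str, str] = {}
    for stage, candidates in stages.items():
        best = max(candidates, key=lambda n: top1_accuracy(chosen + [candidates[n]]))
        names[stage] = best
        chosen.append(candidates[best])
    return names, top1_accuracy(chosen)


if __name__ == "__main__":
    print(greedy_search(STAGES))
```

A greedy search of this kind evaluates the candidates of one stage at a time rather than every full combination, which is why it can only approximate the optimal combination. An exhaustive search over all combinations would be the alternative when the number of candidates is small.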
Taxonomy
Topics: Power Systems and Technologies · Islanding Detection in Power Systems · Power System Reliability and Maintenance
Methods: Adam · Linear Layer · Attention Dropout · Dropout · Weight Decay · Dense Connections · Byte Pair Encoding · BART · Layer Normalization
