AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation
Jia Fu, Xiaoting Qin, Fangkai Yang, Lu Wang, Jue Zhang, Qingwei Lin, Yubo Chen, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

TL;DR
AutoRAG-HP introduces an online hyper-parameter tuning framework for Retrieval-Augmented Generation systems, leveraging hierarchical multi-armed bandits to efficiently optimize parameters with fewer API calls.
Contribution
It formulates hyper-parameter tuning as an online MAB problem and proposes a novel Hierarchical MAB method for efficient exploration in RAG systems.
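To make the online MAB formulation concrete, here is a minimal sketch in which every joint hyper-parameter configuration is one arm of a flat UCB1 bandit and each user query yields one reward. Everything specific in it is an assumption for illustration: the candidate values, the names `ARMS`, `UCB1`, `simulated_feedback`, and `run_online_tuning`, and the toy reward model that stands in for a real RAG call and its evaluation. It is not the paper's implementation.

```python
import math
import random
from itertools import product

# Hypothetical search space; these values are placeholders, not the paper's grid.
TOP_K = [1, 3, 5]
COMPRESSION_RATIO = [0.3, 0.5, 0.7]
EMBEDDING = ["emb_a", "emb_b", "emb_c"]
ARMS = list(product(TOP_K, COMPRESSION_RATIO, EMBEDDING))  # one arm per joint config


class UCB1:
    """Standard UCB1: pick the arm with the highest optimistic reward estimate."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms
        self.total = 0

    def select(self):
        # Play every arm once before trusting the confidence bounds.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        return max(
            range(len(self.counts)),
            key=lambda a: self.means[a]
            + math.sqrt(2.0 * math.log(self.total) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.total += 1
        self.counts[arm] += 1
        # Incremental running mean of the observed rewards for this arm.
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulated_feedback(config, rng):
    """Stand-in for one RAG query plus its evaluation (a made-up reward model)."""
    top_k, ratio, embedding = config
    p = 0.3 + 0.1 * (top_k == 5) + 0.2 * (ratio == 0.7) + 0.2 * (embedding == "emb_b")
    return 1.0 if rng.random() < p else 0.0


def run_online_tuning(rounds, seed=0):
    rng = random.Random(seed)
    bandit = UCB1(len(ARMS))
    for _ in range(rounds):
        arm = bandit.select()  # choose a configuration for this query
        bandit.update(arm, simulated_feedback(ARMS[arm], rng))  # learn from feedback
    return bandit
```

With three candidate values per hyper-parameter, this flat formulation already has 27 arms, and each must be tried at least once before the confidence bounds become useful. That growth in the number of arms is the kind of large search space a hierarchical method is meant to explore more efficiently.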
Findings
Achieves Recall@5 ≈ 0.8 with only ~20% of API calls compared to grid search.
Hier-MAB outperforms baselines in challenging optimization scenarios.
Effective tuning of hyper-parameters like top-k, prompt compression, and embedding methods.
Abstract
Recent advancements in Large Language Models have transformed ML/AI development, necessitating a reevaluation of AutoML principles for Retrieval-Augmented Generation (RAG) systems. To address the challenges of hyper-parameter optimization and online adaptation in RAG, we propose the AutoRAG-HP framework, which formulates hyper-parameter tuning as an online multi-armed bandit (MAB) problem and introduces a novel two-level Hierarchical MAB (Hier-MAB) method for efficient exploration of large search spaces. We conduct extensive experiments on tuning hyper-parameters, such as top-k retrieved documents, prompt compression ratio, and embedding methods, using the ALCE-ASQA and Natural Questions datasets. Our evaluation from jointly optimizing all three hyper-parameters demonstrates that MAB-based online learning methods can achieve Recall@5 for scenarios with prominent…
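One way to picture a two-level hierarchical bandit is sketched below. The split used here is an assumption, since the abstract as quoted does not spell out the structure: a high-level bandit chooses which hyper-parameter to adjust, and a low-level bandit per hyper-parameter chooses its value. The names `UCBBandit`, `hier_mab_tune`, `SPACE`, and `toy_reward`, the candidate values, and the reward model are all hypothetical. Crediting the same reward to both levels is a simplification and should not be read as the authors' Hier-MAB.

```python
import math
import random


class UCBBandit:
    """Minimal UCB1 learner, reused at both levels of the hierarchy."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms

    def select(self):
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm  # try every arm once before using the UCB score
        total = sum(self.counts)
        return max(
            range(len(self.counts)),
            key=lambda a: self.means[a]
            + math.sqrt(2.0 * math.log(total) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

    def best(self):
        return max(range(len(self.counts)), key=lambda a: self.means[a])


def hier_mab_tune(space, reward_fn, rounds):
    names = list(space)
    high = UCBBandit(len(names))                        # which hyper-parameter to adjust
    low = {n: UCBBandit(len(space[n])) for n in names}  # which value to give it
    current = {n: 0 for n in names}                     # index of the value in use
    for _ in range(rounds):
        h = high.select()
        name = names[h]
        v = low[name].select()
        current[name] = v  # other hyper-parameters keep their current values
        config = {n: space[n][current[n]] for n in names}
        reward = reward_fn(config)
        low[name].update(v, reward)  # credit the chosen value
        high.update(h, reward)       # credit the chosen hyper-parameter
    best = {n: space[n][low[n].best()] for n in names}
    return best, high, low


# Hypothetical search space and reward model, for illustration only.
SPACE = {
    "top_k": [1, 3, 5],
    "compression_ratio": [0.3, 0.5, 0.7],
    "embedding": ["emb_a", "emb_b", "emb_c"],
}


def toy_reward(config, rng):
    p = 0.3 + 0.1 * (config["top_k"] == 5) + 0.2 * (config["compression_ratio"] == 0.7)
    p += 0.2 * (config["embedding"] == "emb_b")
    return 1.0 if rng.random() < p else 0.0
```

A call such as `hier_mab_tune(SPACE, lambda cfg: toy_reward(cfg, random.Random(0)), 150)` would reuse a fresh generator on every query, so in practice one `random.Random` instance should be created once and captured by the lambda. In this sketch the hierarchy maintains 3 high-level and 9 low-level reward estimates, instead of the 27 a flat bandit over joint configurations would need.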
Taxonomy
Topics: Neural Networks and Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: Attention Is All You Need · Weight Decay · WordPiece · Softmax · Layer Normalization · Linear Warmup With Linear Decay · Byte Pair Encoding · Attention Dropout · Dropout
