AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation

Jia Fu; Xiaoting Qin; Fangkai Yang; Lu Wang; Jue Zhang; Qingwei Lin; Yubo Chen; Dongmei Zhang; Saravan Rajmohan; Qi Zhang

arXiv:2406.19251·cs.CL·June 28, 2024

TL;DR

AutoRAG-HP introduces an online hyper-parameter tuning framework for Retrieval-Augmented Generation systems, using hierarchical multi-armed bandits to optimize parameters with fewer API calls than grid search.

Contribution

It formulates hyper-parameter tuning as an online MAB problem and proposes a novel Hierarchical MAB method for efficient exploration in RAG systems.
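To make the online MAB formulation concrete, here is a minimal sketch in which each candidate value of a single hyper-parameter is one arm and each evaluation yields a reward. It uses the standard UCB1 rule as a stand-in; the summary does not say which bandit algorithm the paper uses, and `tune_online`, `noisy_feedback`, and the success rates in `fake_quality` are hypothetical names and made-up numbers, not results from the paper.

```python
import math
import random


class UCB1:
    """UCB1 bandit; each arm is one candidate hyper-parameter value."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = [0] * len(self.arms)   # pulls per arm
        self.means = [0.0] * len(self.arms)  # running mean reward per arm

    def select(self):
        # Try every arm once before trusting the confidence bounds.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        t = sum(self.counts)
        return max(
            range(len(self.arms)),
            key=lambda i: self.means[i] + math.sqrt(2 * math.log(t) / self.counts[i]),
        )

    def update(self, i, reward):
        # Incremental update of the arm's mean reward.
        self.counts[i] += 1
        self.means[i] += (reward - self.means[i]) / self.counts[i]


def tune_online(arms, reward_fn, rounds):
    """Run the bandit for `rounds` evaluations; return the most-pulled arm."""
    bandit = UCB1(arms)
    for _ in range(rounds):
        i = bandit.select()
        bandit.update(i, reward_fn(bandit.arms[i]))
    best = max(range(len(bandit.arms)), key=bandit.counts.__getitem__)
    return bandit.arms[best], bandit.counts


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up success rates per top-k value, for illustration only.
    fake_quality = {1: 0.3, 3: 0.5, 5: 0.8, 7: 0.6, 9: 0.4}

    def noisy_feedback(top_k):
        # Stand-in for one RAG query plus a 0/1 quality judgment.
        return 1.0 if rng.random() < fake_quality[top_k] else 0.0

    best_k, pulls = tune_online(list(fake_quality), noisy_feedback, rounds=500)
    print("most-pulled top-k:", best_k, "pulls per arm:", pulls)
```

Each round costs one evaluation, so the bandit spends most of its budget on promising values instead of evaluating every value equally often as grid search would.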

Findings

01

Achieves Recall@5 ≈ 0.8 using only ~20% of the API calls required by grid search.

02

Hier-MAB outperforms baselines in challenging optimization scenarios.

03

Effective tuning of hyper-parameters like top-k, prompt compression, and embedding methods.

Abstract

Recent advancements in Large Language Models have transformed ML/AI development, necessitating a reevaluation of AutoML principles for Retrieval-Augmented Generation (RAG) systems. To address the challenges of hyper-parameter optimization and online adaptation in RAG, we propose the AutoRAG-HP framework, which formulates hyper-parameter tuning as an online multi-armed bandit (MAB) problem and introduces a novel two-level Hierarchical MAB (Hier-MAB) method for efficient exploration of large search spaces. We conduct extensive experiments on tuning hyper-parameters, such as top-k retrieved documents, prompt compression ratio, and embedding methods, using the ALCE-ASQA and Natural Questions datasets. Our evaluation of jointly optimizing all three hyper-parameters demonstrates that MAB-based online learning methods can achieve Recall@5 $\approx 0.8$ for scenarios with prominent…
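One way to picture a two-level hierarchical bandit over the three hyper-parameters is sketched below. The abstract only states that the method has two levels, so the split used here is an assumption: an upper-level bandit chooses which hyper-parameter to adjust and a lower-level bandit chooses its value. The function `hier_mab`, the search-space values, the embedding names, and the synthetic `fake_evaluate` score are all hypothetical placeholders.

```python
import math
import random


class UCB:
    """Minimal UCB1 bandit over arm indices 0..n_arms-1."""

    def __init__(self, n_arms):
        self.n = [0] * n_arms
        self.mean = [0.0] * n_arms

    def select(self):
        if 0 in self.n:  # play untried arms first
            return self.n.index(0)
        t = sum(self.n)
        return max(
            range(len(self.n)),
            key=lambda i: self.mean[i] + math.sqrt(2 * math.log(t) / self.n[i]),
        )

    def update(self, i, reward):
        self.n[i] += 1
        self.mean[i] += (reward - self.mean[i]) / self.n[i]


def hier_mab(space, evaluate, rounds):
    """Two-level search: pick a hyper-parameter, then pick its value."""
    names = list(space)
    high = UCB(len(names))                        # which parameter to adjust
    low = {p: UCB(len(space[p])) for p in names}  # which value to try
    config = {p: space[p][0] for p in names}      # start from the first values
    for _ in range(rounds):
        h = high.select()
        p = names[h]
        a = low[p].select()
        config[p] = space[p][a]
        reward = evaluate(dict(config))           # one evaluation per round
        high.update(h, reward)
        low[p].update(a, reward)
    # Report, per parameter, the value its low-level bandit pulled most often.
    return {
        p: space[p][max(range(len(space[p])), key=low[p].n.__getitem__)]
        for p in names
    }


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical search space; values and embedding names are placeholders.
    space = {
        "top_k": [1, 3, 5, 7, 9],
        "compression_ratio": [0.3, 0.5, 0.7, 1.0],
        "embedding": ["embed_a", "embed_b"],
    }

    def fake_evaluate(cfg):
        # Synthetic score in [0, 1]; a real system would run a RAG query here.
        score = 0.5 - 0.05 * abs(cfg["top_k"] - 5)
        score += 0.2 * cfg["compression_ratio"]
        score += 0.1 if cfg["embedding"] == "embed_b" else 0.0
        return min(1.0, max(0.0, score + rng.gauss(0, 0.05)))

    print(hier_mab(space, fake_evaluate, rounds=300))
```

In this sketch the number of arms grows with the sum of the candidate counts per hyper-parameter (5 + 4 + 2 low-level arms plus 3 high-level arms) rather than their product (5 × 4 × 2 = 40 joint configurations), which is the usual motivation for splitting a large search space into levels.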

Taxonomy

Topics: Neural Networks and Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Attention Is All You Need · Weight Decay · WordPiece · Softmax · Layer Normalization · Linear Warmup With Linear Decay · Byte Pair Encoding · Attention Dropout · Dropout