Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency

Adel Ammar; Anis Koubaa; Omer Nacar; Wadii Boulila

arXiv:2505.08445 · cs.LG · May 14, 2025

TL;DR

This paper analyzes how hyperparameters affect the speed and quality of retrieval-augmented generation systems, providing insights for optimizing performance and accuracy in practical applications.

Contribution

The paper offers a comprehensive evaluation of RAG hyperparameters (vector store, chunking policy, cross-encoder re-ranking, and temperature), highlighting trade-offs and identifying optimal configurations for different performance metrics.

Findings

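01. Faiss yields higher retrieval precision than Chroma, while Chroma processes queries 13% faster.

02. Naive fixed-length chunking with small windows and minimal overlap outperforms semantic segmentation in both speed and quality.

03. Re-ranking provides modest gains in retrieval quality but increases runtime by roughly a factor of 5.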

Abstract

Large language models achieve high task performance yet often hallucinate or rely on outdated knowledge. Retrieval-augmented generation (RAG) addresses these gaps by coupling generation with external search. We analyse how hyperparameters influence speed and quality in RAG systems, covering Chroma and Faiss vector stores, chunking policies, cross-encoder re-ranking, and temperature, and we evaluate six metrics: faithfulness, answer correctness, answer relevancy, context precision, context recall, and answer similarity. Chroma processes queries 13% faster, whereas Faiss yields higher retrieval precision, revealing a clear speed-accuracy trade-off. Naive fixed-length chunking with small windows and minimal overlap outperforms semantic segmentation while remaining the quickest option. Re-ranking provides modest gains in retrieval quality yet increases runtime by roughly a factor of 5, so…
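The naive fixed-length chunking policy mentioned in the abstract can be illustrated with a minimal sketch. The function name `chunk_fixed`, the character-based window, and the default `size` and `overlap` values are assumptions chosen for illustration; they are not the paper's implementation or its reported settings, which may count tokens rather than characters.

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-length character windows that share `overlap` characters.

    Hypothetical helper: `size` and `overlap` are illustrative defaults,
    not values taken from the paper.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk made only of overlap is emitted.
        if start + size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    print(chunk_fixed("abcdefghij", size=4, overlap=1))
```

A smaller `overlap` means fewer chunks to embed and search, which is one plausible reason a minimal-overlap policy stays quick; the retrieval quality of any given setting still has to be measured on the target corpus.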

Taxonomy

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · WordPiece