Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency
Adel Ammar, Anis Koubaa, Omer Nacar, Wadii Boulila

TL;DR
This paper analyzes how hyperparameters affect the speed and quality of retrieval-augmented generation systems, providing insights for optimizing performance and accuracy in practical applications.
Contribution
It offers a comprehensive evaluation of hyperparameters in RAG systems, highlighting trade-offs and identifying optimal configurations for different performance metrics.
Findings
Faiss yields higher retrieval precision than Chroma.
Naive chunking outperforms semantic segmentation in speed and quality.
Re-ranking improves retrieval quality but increases runtime significantly.
Abstract
Large language models achieve high task performance yet often hallucinate or rely on outdated knowledge. Retrieval-augmented generation (RAG) addresses these gaps by coupling generation with external search. We analyse how hyperparameters influence speed and quality in RAG systems, covering Chroma and Faiss vector stores, chunking policies, cross-encoder re-ranking, and temperature, and we evaluate six metrics: faithfulness, answer correctness, answer relevancy, context precision, context recall, and answer similarity. Chroma processes queries 13% faster, whereas Faiss yields higher retrieval precision, revealing a clear speed-accuracy trade-off. Naive fixed-length chunking with small windows and minimal overlap outperforms semantic segmentation while remaining the quickest option. Re-ranking provides modest gains in retrieval quality yet increases runtime by roughly a factor of 5, so…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · Attention Is All You Need · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · WordPiece
