RAGPerf: An End-to-End Benchmarking Framework for Retrieval-Augmented Generation Systems
Shaobo Li, Yirui Zhou, Yuan Xu, Kevin Chen, Daniel Waddington, Swaminathan Sundararaman, Hubertus Franke, and Jian Huang

TL;DR
RAGPerf is a benchmarking framework for retrieval-augmented generation (RAG) systems that profiles each pipeline component separately, supports diverse datasets, and automates the collection of performance and accuracy metrics.
Contribution
It introduces a modular, configurable benchmarking framework for RAG systems, enabling detailed performance profiling and analysis across various components and real-world scenarios.
Findings
RAGPerf incurs negligible performance overhead.
Supports diverse datasets and retrieval models.
Provides detailed metrics for system behavior analysis.
Abstract
We present the design and implementation of a RAG-based AI system benchmarking (RAGPerf) framework for characterizing the system behaviors of RAG pipelines. To facilitate detailed profiling and fine-grained performance analysis, RAGPerf decouples the RAG workflow into several modular components: embedding, indexing, retrieval, reranking, and generation. RAGPerf lets users configure the core parameters of each component and examine their impact on end-to-end query performance and quality. RAGPerf has a workload generator that models real-world scenarios by supporting diverse datasets (e.g., text, PDF, code, and audio), different retrieval and update ratios, and query distributions. RAGPerf also supports different embedding models, major vector databases such as LanceDB, Milvus, Qdrant, Chroma, and Elasticsearch, as well as different LLMs for content generation.…
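To make the idea of decoupling the workflow into separately profiled components concrete, the following is a minimal sketch in Python. It is not RAGPerf's API: the function `run_pipeline`, the stage functions, and the `timings` structure are hypothetical names chosen for illustration, and the stages are trivial placeholders rather than real embedding models, vector databases, or LLMs. The indexing stage is left out because the sketch only follows a single query.

```python
import time
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

# A stage is a (name, function) pair; each function consumes the previous output.
Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(stages: List[Stage], query: Any,
                 timings: Dict[str, List[float]]) -> Any:
    """Run the stages in order and record wall-clock seconds for each one."""
    data = query
    for name, fn in stages:
        start = time.perf_counter()
        data = fn(data)
        timings[name].append(time.perf_counter() - start)
    return data


# Placeholder stages (hypothetical stand-ins, not RAGPerf components).
def embed(text: str) -> List[float]:
    # Toy "embedding": one number per word.
    return [float(len(word)) for word in text.split()]


def retrieve(vector: List[float]) -> List[str]:
    # Toy "retrieval": map each vector entry to a document id.
    return [f"doc-{int(v)}" for v in vector]


def rerank(docs: List[str]) -> List[str]:
    # Toy "reranking": deduplicate and sort lexicographically.
    return sorted(set(docs))


def generate(docs: List[str]) -> str:
    # Toy "generation": concatenate the retrieved ids into an answer string.
    return "answer based on " + ", ".join(docs)


STAGES: List[Stage] = [
    ("embedding", embed),
    ("retrieval", retrieve),
    ("reranking", rerank),
    ("generation", generate),
]

timings: Dict[str, List[float]] = defaultdict(list)
answer = run_pipeline(STAGES, "what is retrieval augmented generation", timings)

for name, samples in timings.items():
    print(f"{name}: {sum(samples) / len(samples):.6f} s (n={len(samples)})")
```

Because every stage is just a named callable, one stage can be replaced (for example, a different reranker) without touching the others, and the per-stage timings show where the end-to-end latency goes.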
Taxonomy
Topics: Information Retrieval and Search Behavior · Advanced Database Systems and Queries · Algorithms and Data Compression
