Generating Diverse Q&A Benchmarks for RAG Evaluation with DataMorgana
Simone Filice, Guy Horowitz, David Carmel, Zohar Karnin, Liane Lewin-Eytan, Yoelle Maarek

TL;DR
DataMorgana is a new tool that generates highly customizable, diverse synthetic Q&A benchmarks for RAG systems, improving over existing methods in lexical, syntactic, and semantic diversity for domain-specific and general knowledge data.
Contribution
The paper introduces DataMorgana, a lightweight, configurable tool for creating diverse, realistic Q&A benchmarks tailored to RAG applications, with an efficient two-stage generation process.
Findings
DataMorgana outperforms existing tools on lexical, syntactic, and semantic diversity metrics.
Generated benchmarks reflect realistic user interaction scenarios.
The tool is efficient and customizable for various domains.
Abstract
Evaluating Retrieval-Augmented Generation (RAG) systems, especially in domain-specific contexts, requires benchmarks that address the distinctive requirements of the applicative scenario. Since real data can be hard to obtain, a common strategy is to use LLM-based methods to generate synthetic data. Existing solutions are general purpose: given a document, they generate a question to build a Q&A pair. However, although the generated questions can be individually good, they are typically not diverse enough to reasonably cover the different ways real end-users can interact with the RAG system. We introduce here DataMorgana, a tool for generating highly customizable and diverse synthetic Q&A benchmarks tailored to RAG applications. DataMorgana enables detailed configurations of user and question categories and provides control over their distribution within the benchmark. It uses a…
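To make the idea of configurable user and question categories with a controlled distribution more concrete, here is a minimal sketch in Python. It is not DataMorgana's actual configuration format or API: the category names, descriptions, probabilities, and the helpers `sample_category` and `build_plan` are all hypothetical, chosen only to illustrate how a target distribution over categories could drive the prompts sent to an LLM.

```python
import random

# Hypothetical category configuration (an assumption, not DataMorgana's schema).
# Each category has a description for the prompt and a target probability.
USER_CATEGORIES = {
    "expert": ("a specialist who uses precise domain terminology", 0.4),
    "novice": ("a newcomer who asks in plain, everyday language", 0.6),
}
QUESTION_CATEGORIES = {
    "factoid": ("a question with a short, factual answer", 0.5),
    "open-ended": ("a question that needs an explanatory answer", 0.3),
    "comparison": ("a question that contrasts two things", 0.2),
}


def sample_category(categories, rng):
    """Draw one category name according to its configured probability."""
    names = list(categories)
    weights = [categories[name][1] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def build_plan(document, rng):
    """Sample one user and one question category and build an LLM prompt."""
    user = sample_category(USER_CATEGORIES, rng)
    question = sample_category(QUESTION_CATEGORIES, rng)
    prompt = (
        f"You are {USER_CATEGORIES[user][0]}.\n"
        f"Write {QUESTION_CATEGORIES[question][0]} "
        "that the document below can answer, followed by the answer.\n\n"
        f"Document:\n{document}"
    )
    return {"user_category": user, "question_category": question, "prompt": prompt}


# Seeded generator so the sampled benchmark plan is reproducible.
rng = random.Random(0)
plans = [build_plan("Example document text.", rng) for _ in range(1000)]
```

Each entry in `plans` carries the sampled categories alongside its prompt, so the resulting benchmark can later be sliced by user or question type, and the empirical category mix can be checked against the configured probabilities.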
Taxonomy
Topics: Expert finding and Q&A systems · Q Methodology Applications
Methods: Attention Is All You Need · Layer Normalization · Dense Connections · Adam · Softmax · Linear Warmup With Linear Decay · Residual Connection · Dropout · Byte Pair Encoding
