Diverse And Private Synthetic Datasets Generation for RAG evaluation: A multi-agent framework
Ilias Driouich, Hongliu Cao, Eoin Thomas

TL;DR
This paper presents a multi-agent framework for generating diverse, privacy-preserving synthetic datasets to improve the evaluation of retrieval-augmented generation systems, addressing current limitations in dataset quality and privacy concerns.
Contribution
It introduces a novel multi-agent approach combining diversity, privacy, and QA curation agents to create high-quality synthetic datasets for RAG evaluation.
Findings
Outperforms baseline methods in dataset diversity
Achieves robust privacy masking across domains
Provides a practical framework for ethical dataset generation
Abstract
Retrieval-augmented generation (RAG) systems improve large language model outputs by incorporating external knowledge, enabling more informed and context-aware responses. However, the effectiveness and trustworthiness of these systems critically depend on how they are evaluated, particularly on whether the evaluation process captures real-world constraints like protecting sensitive information. While current evaluation efforts for RAG systems have primarily focused on the development of performance metrics, far less attention has been given to the design and quality of the underlying evaluation datasets, despite their pivotal role in enabling meaningful, reliable assessments. In this work, we introduce a novel multi-agent framework for generating synthetic QA datasets for RAG evaluation that prioritize semantic diversity and privacy preservation. Our approach involves: (1) a Diversity…
