DRAGOn: Designing RAG On Periodically Updated Corpus

Fedor Chernogorskii; Sergei Averkiev; Liliya Kudraleeva; Zaven Martirosian; Maria Tikhonova; Valentin Malykh; Alena Fenogenova

arXiv:2507.05713 · cs.CL · February 10, 2026

Open Access

TL;DR

DRAGOn presents a comprehensive framework for designing and evaluating RAG benchmarks on periodically updated corpora, including datasets, question generation, evaluation metrics, and a public leaderboard to foster community progress.

Contribution

The paper introduces a methodology for creating and maintaining RAG benchmarks that are regenerated at regular intervals, which supports fair comparison across systems and reduces data leakage.

Findings

01. Effective automatic question generation from knowledge graphs

02. A diverse set of evaluation metrics for RAG systems

03. Successful deployment on Russian news datasets

Abstract

This paper introduces DRAGOn, a method for designing a RAG benchmark on a regularly updated corpus. It features recent reference datasets, a question generation framework, an automatic evaluation pipeline, and a public leaderboard. Specified reference datasets allow for uniform comparison of RAG systems, while newly generated dataset versions mitigate data leakage and ensure that all models are evaluated on unseen, comparable data. The pipeline for automatic question generation extracts a Knowledge Graph from the text corpus and produces multiple question-answer pairs using modern LLM capabilities. A set of diverse LLM-as-Judge metrics is provided for comprehensive model evaluation. We use Russian news outlets to form the datasets and demonstrate our methodology. We launch a public leaderboard to track the development of RAG systems and encourage community participation.
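The question generation step described in the abstract (Knowledge Graph triples in, question-answer pairs out) can be illustrated with a minimal sketch. This is not the DRAGOn implementation: the abstract states that an LLM produces the questions, whereas the sketch below substitutes fixed string templates so it runs without any model. The `Triple` class, the relation names, the `TEMPLATES` table, and `generate_qa` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One knowledge-graph edge: (subject, relation, object). Hypothetical schema."""
    subject: str
    relation: str
    obj: str


# Hypothetical question templates keyed by relation name.
# A real pipeline would prompt an LLM here instead of using fixed templates.
TEMPLATES = {
    "founded_by": "Who founded {subject}?",
    "headquartered_in": "Where is {subject} headquartered?",
}


def generate_qa(triples):
    """Turn each triple whose relation has a template into a (question, answer) pair.

    The object of the triple serves as the reference answer. Triples with
    an unknown relation are skipped rather than guessed at.
    """
    pairs = []
    for t in triples:
        template = TEMPLATES.get(t.relation)
        if template is None:
            continue
        pairs.append((template.format(subject=t.subject), t.obj))
    return pairs


if __name__ == "__main__":
    sample = [
        Triple("Acme", "founded_by", "Jane Doe"),
        Triple("Acme", "headquartered_in", "Springfield"),
        Triple("Acme", "acquired", "Globex"),  # no template, so it is skipped
    ]
    for question, answer in generate_qa(sample):
        print(question, "->", answer)
```

Keeping the triple's object as the reference answer means each generated question has a ground truth that traces back to the corpus, which is what lets a judge metric score a RAG system's response against it.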

Taxonomy

Topics: Speech Recognition and Synthesis · Advanced Data Compression Techniques