Resources for Automated Evaluation of Assistive RAG Systems that Help Readers with News Trustworthiness Assessment

Dake Zhang; Mark D. Smucker; Charles L. A. Clarke

arXiv:2602.24277 · cs.IR · March 2, 2026

TL;DR

This paper introduces resources and evaluation methods for assistive RAG systems designed to help readers assess news trustworthiness, including new tasks, rubrics, and an automated judging process validated against human assessments.

Contribution

The paper develops reusable evaluation resources and an automated judging process for assistive RAG systems in news trustworthiness assessment, facilitating future research and benchmarking.

Findings

01. AutoJudge correlates well with human assessments (Kendall's tau = 0.678 and 0.872).

02. Resources enable evaluation and improvement of assistive RAG systems.

03. Track tasks include question and report generation for news articles.
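
To make the reported agreement measure concrete, here is a minimal sketch of how Kendall's tau between two sets of system scores can be computed. The summary does not say which tau variant the authors used or how ties were handled, so the tau-a formula below is an assumption, and the function name `kendall_tau_a` and every score are invented for illustration rather than taken from the paper.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a between two equal-length score lists.

    Tied pairs count as neither concordant nor discordant, and the
    denominator is the total number of pairs (no tie correction).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # Positive product: both judges order systems i and j the same way.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-system mean scores (made up, NOT from the paper).
human_scores = [0.62, 0.48, 0.55, 0.31, 0.70]
auto_scores = [0.58, 0.50, 0.47, 0.29, 0.66]

tau = kendall_tau_a(human_scores, auto_scores)
print(f"Kendall's tau-a = {tau:.3f}")
```

In this toy data the two judges disagree on the order of only one pair of systems out of ten, so the value is high. A tau near 1 means the automated judge ranks systems almost the same way as the human assessors do.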

Abstract

Many readers today struggle to assess the trustworthiness of online news because reliable reporting coexists with misinformation. The TREC 2025 DRAGUN (Detection, Retrieval, and Augmented Generation for Understanding News) Track provided a venue for researchers to develop and evaluate assistive RAG systems that support readers' news trustworthiness assessment by producing reader-oriented, well-attributed reports. As the organizers of the DRAGUN track, we describe the resources that we have newly developed to allow for the reuse of the track's tasks. The track had two tasks: (Task 1) Question Generation, producing 10 ranked investigative questions; and (Task 2, the main task) Report Generation, producing a 250-word report grounded in the MS MARCO V2.1 Segmented Corpus. As part of the track's evaluation, we had TREC assessors create importance-weighted rubrics of questions with expected…

Taxonomy

Topics: Misinformation and Its Impacts · Text Readability and Simplification · Health Literacy and Information Accessibility