CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era

Kaiwen Shi; Weixiang Sun; Zheyuan Zhang; Lichao Sun; Nitesh V. Chawla; Yanfang Ye

arXiv:2602.23452 · cs.CL · May 5, 2026

TL;DR

CiteAudit is a benchmark and detection framework for hallucinated citations. It pairs a multi-agent verification pipeline with a large-scale, human-validated dataset of scientific references.

Contribution

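The paper presents a multi-agent verification pipeline that decomposes citation checking into metadata extraction, memory lookup, web-based retrieval, and final judgment, along with a large-scale, human-validated dataset for detecting hallucinated citations. The authors report that the framework outperforms state-of-the-art LLMs and commercial baselines.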

Findings

01 The framework achieves better verification performance than state-of-the-art LLMs and commercial baselines.

02 A large-scale, human-validated dataset was constructed, spanning diverse domains and hallucination types.

03 Code is publicly available at https://github.com/shiiiikw/CiteAudit.

Abstract

Scientific research relies on citation integrity, yet large language models (LLMs) have introduced a critical risk: fabricated references that appear plausible but correspond to no real publications. As manual verification becomes infeasible and existing automated tools remain fragile, we introduce CiteAudit, a comprehensive benchmark and detection framework for hallucinated citations. We design a multi-agent verification pipeline that decomposes citation checking into metadata extraction, memory lookup, web-based retrieval, and final judgment. To evaluate this, we construct a large-scale, human-validated dataset spanning diverse domains and hallucination types. Experiments demonstrate that our framework achieves superior verification performance over state-of-the-art LLMs and commercial baselines. Our work provides the necessary infrastructure to audit citations at scale and safeguard…
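The four stages named in the abstract (metadata extraction, memory lookup, web-based retrieval, final judgment) can be illustrated with a toy sketch. This is not the authors' implementation: every name here (`Citation`, `extract_metadata`, `verify`, the verdict labels, the 0.9 similarity threshold) is a hypothetical choice made for illustration, the "agents" are plain functions rather than LLM calls, and the web stage is an injected stub so the example runs offline.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Citation:
    """Minimal citation record (hypothetical schema, not from the paper)."""
    title: str
    year: Optional[int] = None


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation so titles compare loosely."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def extract_metadata(raw: str) -> Citation:
    """Stage 1 (toy): take a double-quoted title and a four-digit year."""
    title_match = re.search(r'"([^"]+)"', raw)
    year_match = re.search(r"\b(?:19|20)\d{2}\b", raw)
    title = title_match.group(1) if title_match else raw
    year = int(year_match.group(0)) if year_match else None
    return Citation(title=title, year=year)


def best_match(
    query: Citation, candidates: Iterable[Citation]
) -> Tuple[Optional[Citation], float]:
    """Return the candidate whose title is most similar to the query."""
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(
            None, normalize(query.title), normalize(candidate.title)
        ).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def verify(
    raw: str,
    memory: List[Citation],
    web_search: Callable[[str], List[Citation]],
    threshold: float = 0.9,  # assumed cutoff, chosen only for this sketch
) -> str:
    """Run the four stages in order and return a hypothetical verdict label."""
    cite = extract_metadata(raw)                      # Stage 1: metadata extraction
    match, score = best_match(cite, memory)           # Stage 2: memory lookup
    source = "memory"
    if score < threshold:                             # Stage 3: web-based retrieval
        match, score = best_match(cite, web_search(cite.title))
        source = "web"
    # Stage 4: final judgment
    if match is None or score < threshold:
        return "unverified"
    if cite.year is not None and match.year is not None and cite.year != match.year:
        return "metadata_mismatch"
    return f"verified_via_{source}"
```

In this sketch the memory stage acts as a cheap first pass, and the web stage runs only when no stored record matches closely enough. A real system would replace `web_search` with an actual retrieval backend and would likely compare more fields than title and year.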

Code & Models

Repository: shiiiikw/CiteAudit (GitHub)
