Source or It Didn't Happen: A Multi-Agent Framework for Citation Hallucination Detection

Mingzhe Li; Zhiqiang Lin; Shiqing Ma

arXiv:2605.08583 · cs.CL · May 12, 2026

TL;DR

This paper introduces CiteTracer, a multi-agent framework for detecting citation hallucinations in scientific writing by leveraging structured taxonomy, evidence retrieval, and class-specific adjudication.

Contribution

The paper reframes citation hallucination detection as taxonomy-aligned, field-level adjudication over a 12-code taxonomy (Real, Potential, Hallucinated) and presents CiteTracer, a cascading multi-agent detector built on that taxonomy.

Findings

01. CiteTracer achieves 97.1% accuracy on synthetic benchmarks.

02. CiteTracer detects 97.1% of fabrications in real-world citations.

03. CiteTracer outperforms existing detectors in class-level F1 scores.

Abstract

Large language models are increasingly used in scientific writing, yet they can fabricate citation-shaped references that appear plausible but fail bibliographic verification. Existing detectors often reduce verification to binary found/not-found decisions and rely on brittle parsing or incomplete retrieval, offering little field-level signal to auditors. We reframe citation hallucination detection as taxonomy-aligned field-level adjudication and introduce a 12-code taxonomy spanning Real, Potential, and Hallucinated citations. Based on this taxonomy, we build CiteTracer, a cascading multi-agent detector that extracts structured citations from PDF and BibTeX, retrieves evidence through cache lookup, URL fetch, scholar connectors, and web search, applies deterministic field matching, and routes ambiguous cases to class-specialist judgers. We release a benchmark of 2,450 synthetic…
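The cascade described in the abstract (ordered evidence retrieval, then deterministic field matching, then routing of unresolved cases) can be illustrated with a minimal sketch. This is not CiteTracer's code: the `Citation` fields, the retriever signature, the 0.9 title-similarity threshold, and the two coarse labels `"real"` and `"ambiguous"` are all assumptions made for illustration. The paper's actual 12 codes and its judger agents are not modeled here.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Optional


@dataclass
class Citation:
    """Hypothetical structured citation; the real schema is not given in the abstract."""
    title: str
    authors: list[str]
    year: Optional[int] = None


# Assumed interface: a retriever returns the matching record, or None on a miss.
Retriever = Callable[[Citation], Optional[Citation]]


def retrieve(citation: Citation, retrievers: list[Retriever]) -> Optional[Citation]:
    """Try each evidence source in order and stop at the first hit."""
    for retriever in retrievers:
        record = retriever(citation)
        if record is not None:
            return record
    return None


def _norm(text: str) -> str:
    """Lowercase and collapse whitespace so formatting noise does not count as a mismatch."""
    return " ".join(text.lower().split())


def match_fields(cited: Citation, found: Citation, title_threshold: float = 0.9) -> dict[str, bool]:
    """Deterministic per-field comparison; the threshold is an illustrative choice."""
    title_ratio = SequenceMatcher(None, _norm(cited.title), _norm(found.title)).ratio()
    return {
        "title": title_ratio >= title_threshold,
        "authors": [_norm(a) for a in cited.authors] == [_norm(a) for a in found.authors],
        "year": cited.year == found.year,
    }


def adjudicate(citation: Citation, retrievers: list[Retriever]) -> tuple[str, dict[str, bool]]:
    """Return a coarse label plus the field-level signal.

    "ambiguous" stands in for cases that a full system would route to a
    specialist judger instead of deciding here.
    """
    record = retrieve(citation, retrievers)
    if record is None:
        return "ambiguous", {}
    fields = match_fields(citation, record)
    label = "real" if all(fields.values()) else "ambiguous"
    return label, fields


# Usage with stand-in retrievers (placeholders for cache lookup and URL fetch).
known = Citation("An Example Paper Title", ["A. Author"], 2020)
cache_lookup: Retriever = lambda c: None   # simulated cache miss
url_fetch: Retriever = lambda c: known     # simulated successful fetch

label, fields = adjudicate(Citation("An Example Paper Title", ["A. Author"], 2021),
                           [cache_lookup, url_fetch])
```

Returning the per-field dictionary alongside the label is the point of the sketch: an auditor sees which field disagreed (here, the year) instead of a bare found/not-found verdict.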

Code & Models

Repositories

aaFrostnova/CiteTracer
github

Datasets

Afrostnova/Hallucinated_Citation
dataset · 229 downloads