MasterSet: A Large-Scale Benchmark for Must-Cite Citation Recommendation in the AI/ML Literature
Md Toyaha Rahman Ratul, Zhiqian Chen, Kaiqun Fu, Taoran Ji, and Lei Zhang

TL;DR
MasterSet is a large-scale benchmark designed to evaluate the ability of citation recommendation systems to identify essential 'must-cite' papers in AI/ML literature, addressing a critical gap in existing tools.
Contribution
We introduce MasterSet, a comprehensive dataset with annotated must-cite papers and a benchmark task for evaluating must-cite citation recommendation methods in AI/ML.
Findings
Baseline methods show that must-cite retrieval remains challenging.
The benchmark includes over 150,000 papers from top AI/ML conferences.
Annotations include relevance, baseline status, and mention frequency.
Abstract
The explosive growth of AI and machine learning literature (with venues like NeurIPS and ICLR now accepting thousands of papers annually) has made comprehensive citation coverage increasingly difficult for researchers. While citation recommendation has been studied for over a decade, existing systems primarily focus on broad relevance rather than identifying the critical set of "must-cite" papers: direct experimental baselines, foundational methods, and core dependencies whose omission would misrepresent a contribution's novelty or undermine reproducibility. We introduce MasterSet, a large-scale benchmark specifically designed to evaluate must-cite recommendation in the AI/ML domain. MasterSet incorporates over 150,000 papers collected from official conference proceedings/websites of 15 leading venues, serving as a comprehensive candidate pool for retrieval. We annotate citations…
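The abstract frames must-cite recommendation as retrieval over a candidate pool. As a rough illustration only, the sketch below computes the fraction of a query paper's must-cite papers that appear in the top k of a ranked candidate list. The metric choice, the `CitationLabel` record, its field names, and the paper IDs are all assumptions made for illustration; the text above does not specify the benchmark's actual metrics or data schema.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CitationLabel:
    """Hypothetical annotation record; not the dataset's real schema."""
    cited_id: str        # identifier of a cited paper (illustrative)
    is_must_cite: bool   # assumed binary must-cite label


def must_cite_recall_at_k(
    ranked_ids: Sequence[str],
    labels: Sequence[CitationLabel],
    k: int,
) -> Optional[float]:
    """Fraction of must-cite papers found in the top-k ranked candidates."""
    must_cite = {lab.cited_id for lab in labels if lab.is_must_cite}
    if not must_cite:
        # Recall is undefined when the query paper has no must-cite labels.
        return None
    top_k = set(ranked_ids[:k])
    return len(must_cite & top_k) / len(must_cite)


# Toy data: two must-cite papers (p1, p3) and one ordinary citation (p2).
labels = [
    CitationLabel("p1", True),
    CitationLabel("p2", False),
    CitationLabel("p3", True),
]
# A system's ranking over the candidate pool for one query paper.
ranked = ["p2", "p1", "p4", "p3"]

print(must_cite_recall_at_k(ranked, labels, k=2))  # 0.5
print(must_cite_recall_at_k(ranked, labels, k=4))  # 1.0
```

Returning `None` for a query with no must-cite labels keeps such queries out of an average instead of silently counting them as zero; whether the benchmark handles this case the same way is not stated in the text above.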
