How Grounded is Wikipedia? A Study on Structured Evidential Support and Retrieval
William Walden, Kathryn Ricci, Miriam Wanner, Zhengping Jiang, Chandler May, Rongkun Zhou, Benjamin Van Durme

TL;DR
This paper investigates Wikipedia's reliability by analyzing how well claims are supported by citations and introduces a new dataset to facilitate research on evidence retrieval and grounding in Wikipedia articles.
Contribution
It presents PeopleProfiles, a large-scale, multi-level dataset of claim support annotations on biographical Wikipedia articles, and uses it to quantify gaps in citation support and to expose open challenges in evidence retrieval.
Findings
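About 22% of claims in Wikipedia lead sections are unsupported by the article body
About 30% of claims in the article body are unsupported by their publicly accessible sources
Real-world Wikipedia citation practices often differ from documented standards
Complex evidence retrieval remains challenging, even for recent reasoning rerankers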
Abstract
Wikipedia is a critical resource for modern NLP, serving as a rich repository of up-to-date and citation-backed information on a wide variety of subjects. The reliability of Wikipedia -- its groundedness in its cited sources -- is vital to this purpose. This work analyzes both how grounded Wikipedia is and how readily fine-grained grounding evidence can be retrieved. To this end, we introduce PeopleProfiles -- a large-scale, multi-level dataset of claim support annotations on biographical Wikipedia articles. We show that: (1) ~22% of claims in Wikipedia lead sections are unsupported by the article body; (2) ~30% of claims in the article body are unsupported by their publicly accessible sources; and (3) real-world Wikipedia citation practices often differ from documented standards. Finally, we show that complex evidence retrieval remains a challenge -- even for recent reasoning rerankers.
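To make the headline percentages concrete, here is a minimal sketch of how an "unsupported" rate could be computed from claim-level support annotations. The record layout (`ClaimAnnotation`, with `section` and `supported` fields) and the toy records are assumptions made purely for illustration. They are not the PeopleProfiles schema, and the numbers are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimAnnotation:
    """Hypothetical annotation record; not the actual PeopleProfiles schema."""
    claim: str
    section: str      # "lead" or "body" (assumed labels)
    supported: bool   # True if the annotator found supporting evidence


def unsupported_rate(annotations, section):
    """Return the fraction of claims in `section` that are marked unsupported."""
    relevant = [a for a in annotations if a.section == section]
    if not relevant:
        raise ValueError(f"no claims annotated for section {section!r}")
    unsupported = sum(1 for a in relevant if not a.supported)
    return unsupported / len(relevant)


# Invented toy data: four lead claims (one unsupported), two body claims (one unsupported).
toy_annotations = [
    ClaimAnnotation("Born in 1950.", "lead", True),
    ClaimAnnotation("Won a national award.", "lead", False),
    ClaimAnnotation("Worked as a physicist.", "lead", True),
    ClaimAnnotation("Lives in Ohio.", "lead", True),
    ClaimAnnotation("Studied at a state university.", "body", True),
    ClaimAnnotation("Published three books.", "body", False),
]

lead_rate = unsupported_rate(toy_annotations, "lead")
body_rate = unsupported_rate(toy_annotations, "body")
print(f"lead: {lead_rate:.0%}, body: {body_rate:.0%}")
```

In the study, lead claims are checked against the article body, while body claims are checked against their publicly accessible cited sources, so the two rates measure support at different levels even though the arithmetic is the same.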
Taxonomy
Topics: Topic Modeling · Wikis in Education and Collaboration · Advanced Graph Neural Networks
