A Comparative Study of Reference Reliability in Multiple Language Editions of Wikipedia
Aitolkyn Baigutanova, Diego Saez-Trumper, Miriam Redi, Meeyoung Cha, Pablo Aragón

TL;DR
This study analyzes over 5 million Wikipedia articles across multiple languages to evaluate the reliability and cross-lingual patterns of references, revealing persistent untrustworthy sources and cultural discrepancies in source reliability.
Contribution
It provides a large-scale cross-lingual analysis of reference reliability in Wikipedia, highlighting persistent untrustworthy sources and cultural differences in source trustworthiness.
Findings
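Some sources deemed untrustworthy in one language edition (English) continue to appear in articles in other language editions, especially sources tailored for smaller communities.
Non-authoritative sources found in the English version of a page tend to persist in other language versions of that page.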
Discrepancies in reference reliability across cultures are evident.
Abstract
Information presented in Wikipedia articles must be attributable to reliable published sources in the form of references. This study examines over 5 million Wikipedia articles to assess the reliability of references in multiple language editions. We quantify the cross-lingual patterns of the perennial sources list, a collection of reliability labels for web domains identified and collaboratively agreed upon by Wikipedia editors. We discover that some sources (or web domains) deemed untrustworthy in one language (i.e., English) continue to appear in articles in other languages. This trend is especially evident with sources tailored for smaller communities. Furthermore, non-authoritative sources found in the English version of a page tend to persist in other language versions of that page. We finally present a case study on the Chinese, Russian, and Swedish Wikipedias to demonstrate a…
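As a rough illustration of the kind of measurement the abstract describes, the sketch below computes, per language edition, the share of references that point to domains labeled unreliable. It is not the authors' pipeline: the `LABELS` table, the `.example` domains, and the function names `domain_of` and `unreliable_share` are all hypothetical, and the real perennial sources list uses its own label set.

```python
from collections import Counter
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical reliability labels keyed by web domain (illustrative only,
# not taken from the actual perennial sources list).
LABELS: Dict[str, str] = {
    "reliable.example": "reliable",
    "tabloid.example": "unreliable",
}


def domain_of(url: str) -> str:
    """Return the host of a URL, lowercased and without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def unreliable_share(refs_by_lang: Dict[str, List[str]]) -> Dict[str, float]:
    """Map each language code to the fraction of its references whose
    domain is labeled 'unreliable'. Unlabeled domains count toward the
    total but not toward the unreliable count."""
    shares: Dict[str, float] = {}
    for lang, urls in refs_by_lang.items():
        counts = Counter(LABELS.get(domain_of(u), "unlabeled") for u in urls)
        total = sum(counts.values())
        shares[lang] = counts["unreliable"] / total if total else 0.0
    return shares


# Toy input: reference URLs grouped by language edition (made-up data).
refs = {
    "en": ["https://reliable.example/x"],
    "sv": ["https://www.tabloid.example/a", "https://reliable.example/b"],
}
print(unreliable_share(refs))  # {'en': 0.0, 'sv': 0.5}
```

Comparing these shares across editions is one simple way to see whether a domain flagged in one language keeps appearing in others. A fuller analysis would also need to align the same article across languages.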
