Language-Agnostic Modeling of Source Reliability on Wikipedia
Jacopo D'Ignazi, Andreas Kaltenbrunner, Yelena Mejova, Michele Tizzani, Kyriaki Kalimeri, Mariano Beiró, Pablo Aragón

TL;DR
This paper introduces a language-agnostic model that assesses the reliability of web domains used as sources in Wikipedia, leveraging editing activity data to predict credibility across multiple languages with varying resource levels.
Contribution
The study presents a novel, language-agnostic approach to evaluate source reliability on Wikipedia using editing activity features, including a focus on domain permanence as a key predictor.
Findings
Achieves an F1 Macro score of approximately 0.80 for English and other high-resource languages
F1 Macro drops to 0.65 for mid-resource languages, and performance on low-resource languages varies
Adapting models from high-resource languages improves low-resource language performance
Abstract
Over the last few years, verifying the credibility of information sources has become a fundamental need to combat disinformation. Here, we present a language-agnostic model designed to assess the reliability of web domains as sources in references across multiple language editions of Wikipedia. Utilizing editing activity data, the model evaluates domain reliability within different articles of varying controversiality, such as Climate Change, COVID-19, History, Media, and Biology topics. Crafting features that express domain usage across articles, the model effectively predicts domain reliability, achieving an F1 Macro score of approximately 0.80 for English and other high-resource languages. For mid-resource languages, we achieve 0.65, while the performance of low-resource languages varies. In all cases, the time the domain remains present in the articles (which we dub as permanence)…
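The scores quoted above are F1 Macro values, that is, the unweighted mean of the per-class F1 scores. The following is a minimal sketch of that metric only, not the authors' evaluation code. The function name `macro_f1`, the two class names, and the six example labels are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for six web domains (illustrative data, not from the paper)
y_true = ["reliable", "reliable", "reliable", "reliable", "unreliable", "unreliable"]
y_pred = ["reliable", "reliable", "reliable", "unreliable", "unreliable", "reliable"]
score = macro_f1(y_true, y_pred)
print(f"F1 Macro: {score:.3f}")
```

Because every class contributes equally to the average, a model that does well only on the majority class is penalized, which matters if reliable and unreliable domains are unevenly represented.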
