Utilizing citation index and synthetic quality measure to compare Wikipedia languages across various topics
Włodzimierz Lewoniewski, Krzysztof Węcel, Witold Abramowicz

TL;DR
This paper introduces a method combining citation indices and a synthetic quality score to compare Wikipedia language editions across various topics, revealing disparities in content coverage and quality.
Contribution
It develops a novel approach using large-scale link data and a synthetic quality measure to compare Wikipedia editions across languages and topics.
Findings
Identified top cited articles in multiple languages and topics
Revealed disparities in content coverage among Wikipedia editions
Provided a scalable method for quality comparison across languages
Abstract
This study presents a comparative analysis of 55 Wikipedia language editions employing a citation index alongside a synthetic quality measure. Specifically, we identified the most significant Wikipedia articles within distinct topical areas, selecting the top 10, top 25, and top 100 most cited articles in each topic and language version. The citation index was built from wikilinks between Wikipedia articles within each language version, which required processing 6.6 billion page-to-page link records. Next, we assigned each Wikipedia article a quality score, a synthetic measure scaled from 0 to 100. This approach enabled quality comparison of Wikipedia articles even between language versions with different quality grading schemes. Our results highlight disparities among Wikipedia language editions, revealing strengths and gaps in content coverage and quality across topics.
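The pipeline described in the abstract (count inbound wikilinks, keep the top N most cited articles per topic and language, then compare quality scores) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `(language, source, target)` record layout, the topic mapping, the self-link filter, and all sample values are assumptions made for the example. The quality scores are made-up numbers on the 0 to 100 scale mentioned in the abstract; how the synthetic measure is actually computed is not shown here.

```python
from collections import Counter, defaultdict

def citation_index(links):
    """Count inbound wikilinks per (language, target article)."""
    counts = Counter()
    for lang, source, target in links:
        if source != target:  # skipping self-links is an assumption of this sketch
            counts[(lang, target)] += 1
    return counts

def top_cited(counts, topics, n):
    """Return {(lang, topic): [(title, count), ...]} with the n most cited articles."""
    grouped = defaultdict(list)
    for (lang, title), count in counts.items():
        topic = topics.get((lang, title))
        if topic is not None:  # articles without a topic label are ignored
            grouped[(lang, topic)].append((title, count))
    return {
        key: sorted(items, key=lambda item: (-item[1], item[0]))[:n]
        for key, items in grouped.items()
    }

def mean_quality(top, quality):
    """Average the 0-100 quality score over each (lang, topic) top list."""
    result = {}
    for (lang, topic), items in top.items():
        scores = [quality[(lang, title)] for title, _ in items if (lang, title) in quality]
        if scores:
            result[(lang, topic)] = sum(scores) / len(scores)
    return result

# Hypothetical page-to-page link records: (language, source article, target article).
links = [
    ("en", "A", "Physics"),
    ("en", "B", "Physics"),
    ("en", "C", "Chemistry"),
    ("en", "Physics", "Physics"),  # self-link, not counted
    ("pl", "X", "Fizyka"),
]
# Hypothetical topic labels and quality scores.
topics = {
    ("en", "Physics"): "Science",
    ("en", "Chemistry"): "Science",
    ("pl", "Fizyka"): "Science",
}
quality = {("en", "Physics"): 80.0, ("en", "Chemistry"): 60.0, ("pl", "Fizyka"): 40.0}

counts = citation_index(links)
top = top_cited(counts, topics, n=2)
avg = mean_quality(top, quality)
print(avg)
```

Because every score sits on the same 0 to 100 scale, the per-topic averages can be compared directly between language versions, which is the kind of comparison the abstract describes. At the scale of billions of link records, the in-memory counter would need to be replaced by a streaming or distributed aggregation.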
Taxonomy
Topics: Wikis in Education and Collaboration · Information Retrieval and Search Behavior · Library Science and Information Systems
