Global Patterns of Knowledge: Language, Genre, and the Geography of Knowledge
Akira Matsui, Fujio Toriumi, Mitsuo Yoshida, Taichi Murayama, Shiori Hironaka

TL;DR
This paper applies economic complexity analysis to Wikipedia editing history to explore how different language communities contribute knowledge. It reveals distinct modes of knowledge production shaped by cultural and geopolitical factors, with implications for bias in the data that AI systems are built on.
Contribution
It introduces a novel application of economic complexity analysis to Wikipedia editing history, uncovering the diversity and geopolitical influences on knowledge production across languages.
Findings
Distinct cultural specializations in language communities
Knowledge production modes reflect geopolitical boundaries
Standardized topics show common production patterns
Abstract
Online platforms, particularly Wikipedia, have become critical infrastructures for providing diverse linguistic and cultural contexts. This human-curated knowledge now forms the foundation for modern AI. However, we have not yet fully explored how knowledge production capabilities vary across languages and domains. Here, we address this gap by applying economic complexity analysis to understand the editing history of Wikipedia platforms. This approach allows us to infer the latent mode of "knowledge-production" of each language community from the diversity and specialization of its contributed content. We reveal that different language communities exhibit distinct specializations, particularly in cultural subjects. Furthermore, we map the global landscape of these production modes, finding that the structure of knowledge production strongly reflects geopolitical boundaries. Our findings…
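To make "diversity and specialization" concrete, here is a minimal sketch of the first step of a standard economic complexity analysis: revealed comparative advantage (RCA), followed by diversity and ubiquity counts. It is an illustration under stated assumptions, not the authors' pipeline. The abstract does not give their exact formulation, and the function names, the RCA threshold of 1.0, and the toy edit counts below are all hypothetical.

```python
import numpy as np


def rca_matrix(counts):
    """Revealed comparative advantage for a (language x genre) count matrix.

    RCA[l, g] = (share of genre g within language l)
                / (share of genre g across all languages).
    Assumes every row and every column has at least one nonzero count.
    """
    counts = np.asarray(counts, dtype=float)
    row_share = counts / counts.sum(axis=1, keepdims=True)
    col_share = counts.sum(axis=0, keepdims=True) / counts.sum()
    return row_share / col_share


def diversity_and_ubiquity(counts, threshold=1.0):
    """Binarize RCA at `threshold` (an assumed cutoff) and count specializations.

    diversity[l] = number of genres language l specializes in
    ubiquity[g]  = number of languages specializing in genre g
    """
    m = (rca_matrix(counts) >= threshold).astype(int)
    return m, m.sum(axis=1), m.sum(axis=0)


# Hypothetical toy data: rows are language editions, columns are genres.
toy_counts = np.array([
    [50, 10, 40],
    [5, 60, 35],
    [30, 30, 40],
])

m, diversity, ubiquity = diversity_and_ubiquity(toy_counts)
print(m)
print("diversity:", diversity)
print("ubiquity:", ubiquity)
```

In this toy matrix the second language edition specializes in a single genre that no other edition specializes in, which is the kind of pattern a specialization analysis is meant to surface. Economic complexity methods typically go on to combine diversity and ubiquity iteratively or spectrally; that step is omitted here.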
