The Tower of Babel Meets Web 2.0: User-Generated Content and its Applications in a Multilingual Context
B. Hecht, D. Gergle

TL;DR
This paper investigates how language diversity across Wikipedia editions affects knowledge representation and discusses leveraging this diversity to develop culturally-aware and hyperlingual applications.
Contribution
It provides a comprehensive analysis of knowledge diversity in Wikipedia's multilingual editions and explores its implications for application development.
Findings
Knowledge diversity across Wikipedia's language editions is greater than the literature has presumed.
This diversity significantly influences applications that use Wikipedia as a source of world knowledge.
The diversity can be leveraged to create culturally-aware and hyperlingual applications.
Abstract
This study explores language's fragmenting effect on user-generated content by examining the diversity of knowledge representations across 25 different Wikipedia language editions. This diversity is measured at two levels: the concepts that are included in each edition and the ways in which these concepts are described. We demonstrate that the diversity present is greater than has been presumed in the literature and has a significant influence on applications that use Wikipedia as a source of world knowledge. We close by explicating how knowledge diversity can be beneficially leveraged to create "culturally-aware applications" and "hyperlingual applications".
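The first level of measurement described in the abstract, which concepts each edition includes, can be illustrated with a small sketch. This is not the authors' method or data: the concept IDs, the three-edition toy dictionary, and the helper names `pairwise_overlap` and `coverage_histogram` are hypothetical, and Jaccard overlap is just one plausible way to compare concept sets.

```python
from itertools import combinations

# Hypothetical toy data: language edition -> set of language-neutral concept IDs.
# Real input would come from aligning articles across editions.
editions = {
    "en": {"c1", "c2", "c3", "c4"},
    "de": {"c2", "c3", "c5"},
    "ja": {"c3", "c6"},
}

def pairwise_overlap(editions):
    """Jaccard overlap of the concept sets for every pair of editions."""
    result = {}
    for a, b in combinations(sorted(editions), 2):
        union = editions[a] | editions[b]
        shared = editions[a] & editions[b]
        result[(a, b)] = len(shared) / len(union) if union else 0.0
    return result

def coverage_histogram(editions):
    """Map k -> number of concepts that appear in exactly k editions."""
    per_concept = {}
    for concepts in editions.values():
        for concept in concepts:
            per_concept[concept] = per_concept.get(concept, 0) + 1
    histogram = {}
    for k in per_concept.values():
        histogram[k] = histogram.get(k, 0) + 1
    return histogram

if __name__ == "__main__":
    for pair, score in pairwise_overlap(editions).items():
        print(pair, round(score, 2))
    print(coverage_histogram(editions))
```

A low pairwise overlap, or a histogram dominated by concepts found in only one edition, would indicate high concept-level diversity in the sense the abstract describes. The second level, how shared concepts are described, would need a separate comparison of article content.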
Taxonomy
Topics: Wikis in Education and Collaboration
