Emo, Love, and God: Making Sense of Urban Dictionary, a Crowd-Sourced Online Dictionary
Dong Nguyen, Barbara McGillivray, Taha Yasseri

TL;DR
Urban Dictionary, a crowd-sourced online dictionary, has grown rapidly and hosts diverse content, including opinions and informal words, but it also contains offensive material, which highlights both its utility and its challenges as a language resource.
Contribution
This study combines computational analysis with qualitative annotation to characterize Urban Dictionary's growth, content types, and quality issues, providing insights into crowd-sourced language documentation.
Findings
High presence of opinion-based entries
Coverage of informal words and proper nouns
Offensive content exists but is rated lower
Abstract
The Internet facilitates large-scale collaborative projects, and the emergence of Web 2.0 platforms, where producers and consumers of content unify, has drastically changed the information market. On the one hand, the promise of the "wisdom of the crowd" has inspired successful projects such as Wikipedia, which has become the primary source of crowd-based information in many languages. On the other hand, the decentralized and often un-monitored environment of such projects may make them susceptible to low quality content. In this work, we focus on Urban Dictionary, a crowd-sourced online dictionary. We combine computational methods with qualitative annotation and shed light on the overall features of Urban Dictionary in terms of growth, coverage and types of content. We measure a high presence of opinion-focused entries, as opposed to the meaning-focused entries that we expect from…
