From sunblock to softblock: Analyzing the correlates of neology in published writing and on social media
Maria Ryskina, Matthew R. Gormley, Kyle Mahowald, David R. Mortensen, Taylor Berg-Kirkpatrick, Vivek Kulkarni

TL;DR
This study compares how new words emerge in published texts and on social media, using distributional semantics with both static and contextual embeddings. The same factors correlate with neology in both domains, though the two domains may favor different neologism formation mechanisms.
Contribution
It extends prior distributional semantic work on neology by adding contextual embeddings to static ones and applying the analysis to a new corpus of Twitter posts, pointing to domain-specific differences in neologism creation.
Findings
Same correlates of neology in both domains
Topic popularity growth may contribute less to neology on Twitter than in published writing
Social media and published texts may favor different neologism formation mechanisms (hypothesized)
Abstract
Living languages are shaped by a host of conflicting internal and external evolutionary pressures. While some of these pressures are universal across languages and cultures, others differ depending on the social and conversational context: language use in newspapers is subject to very different constraints than language use on social media. Prior distributional semantic work on English word emergence (neology) identified two factors correlated with creation of new words by analyzing a corpus consisting primarily of historical published texts (Ryskina et al., 2020, arXiv:2001.07740). Extending this methodology to contextual embeddings in addition to static ones and applying it to a new corpus of Twitter posts, we show that the same findings hold for both domains, though the topic popularity growth factor may contribute less to neology on Twitter than in published writing. We hypothesize…
Taxonomy
Topics: Linguistics, Language Diversity, and Identity · Digital Communication and Language · Language and cultural evolution
