Computational Sociolinguistics: A Survey
Dong Nguyen, A. Seza Doğruöz, Carolyn P. Rosé, Franciska de Jong

TL;DR
This survey reviews the emerging field of Computational Sociolinguistics, highlighting how large-scale data-driven methods can enhance sociolinguistic research and foster collaboration between computational linguistics and social language studies.
Contribution
It provides a comprehensive overview of CL research on sociolinguistic themes and discusses how data-driven methods can complement and inform sociolinguistic studies.
Findings
Large-scale data methods can enhance sociolinguistic analysis.
Cross-disciplinary collaboration benefits both CL and sociolinguistics.
Open challenges include integrating social context into computational models.
Abstract
Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies.…
Taxonomy
Topics: Linguistic Variation and Morphology · Gender Studies in Language · Authorship Attribution and Profiling
