DialectGram: Detecting Dialectal Variation at Multiple Geographic Resolutions
Hang Jiang, Haoshen Hong, Yuxing Chen, Vivek Kulkarni

TL;DR
DialectGram is a novel model that detects dialectal variation across multiple geographic resolutions without prior region definitions, learns dialect-sensitive embeddings, and models sense proportions, enabling flexible post-hoc analysis.
Contribution
The paper introduces DialectGram, a model that is trained once, detects dialectal variation at multiple geographic levels, and explicitly models word senses, unlike prior models that are tied to a fixed set of regions.
Findings
Effectively models linguistic variation across regions
Requires only one training phase for multiple resolutions
Outperforms baselines on the DialectSim dataset
Abstract
Several computational models have been developed to detect and analyze dialect variation in recent years. Most of these models assume a predefined set of geographical regions over which they detect and analyze dialectal variation. However, dialect variation occurs at multiple levels of geographic resolution, ranging from cities within a state, to states within a country, to countries across continents. In this work, we propose a model that enables detection of dialectal variation at multiple levels of geographic resolution, obviating the need for an a priori definition of the resolution level. Our method, DialectGram, learns dialect-sensitive word embeddings while being agnostic of the geographic resolution. Specifically, it requires only one-time training and enables analysis of dialectal variation at a chosen resolution post-hoc, a significant departure from prior models which need…
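The post-hoc analysis described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each usage of a word has already been assigned a sense id by some sense-aware model trained once on the whole corpus, and that each usage carries geographic tags. The record fields, the helper names sense_proportions and js_divergence, and the choice of Jensen-Shannon divergence as the comparison measure are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical input: one record per usage of a single target word.
# The "sense" field is assumed to come from a sense-aware embedding model
# that was trained once, with no region definitions baked in.
usages = [
    {"city": "Austin", "state": "TX", "country": "US", "sense": 0},
    {"city": "Austin", "state": "TX", "country": "US", "sense": 0},
    {"city": "Boston", "state": "MA", "country": "US", "sense": 0},
    {"city": "Boston", "state": "MA", "country": "US", "sense": 1},
    {"city": "London", "state": "ENG", "country": "UK", "sense": 1},
    {"city": "London", "state": "ENG", "country": "UK", "sense": 1},
]


def sense_proportions(records, level, num_senses):
    """Group usages by the chosen geographic level and return, for each
    region, the proportion of usages assigned to each sense."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record[level]][record["sense"]] += 1
    proportions = {}
    for region, sense_counts in counts.items():
        total = sum(sense_counts.values())
        proportions[region] = [sense_counts[s] / total for s in range(num_senses)]
    return proportions


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sense distributions.
    0 means identical usage; 1 means completely disjoint senses."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


# The same sense assignments are re-aggregated at two resolutions
# without retraining anything.
by_country = sense_proportions(usages, "country", num_senses=2)
by_state = sense_proportions(usages, "state", num_senses=2)

country_gap = js_divergence(by_country["US"], by_country["UK"])
state_gap = js_divergence(by_state["TX"], by_state["MA"])
```

The point of the sketch is that the resolution is only a grouping key applied after training: switching from "country" to "state" changes the aggregation, not the model.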
Taxonomy
Topics: Natural Language Processing Techniques · Linguistic Variation and Morphology · Web Data Mining and Analysis
