Statistical field theory for dialectology
James Burridge

TL;DR
This paper introduces a statistical field theory model of linguistic evolution that captures how dialects change across space and time and supports prediction of future change from large-scale survey data.
Contribution
It develops a statistical-physics-based model that incorporates social and spatial factors to predict linguistic change at the level of individual linguistic features.
Findings
The model accurately fits 20th-century US dialect data.
Bias field has a measurable half-life affecting predictability.
Surface tension influences dialect region coarsening.
Abstract
Is it possible to develop a 'physics of language' which can explain the spatial, temporal and social patterns we see, and which can predict future change like we forecast the weather? Such a theory is likely to involve ideas from statistical physics. A substantial literature already applies these ideas to language. However, we lack a model which can match the spatial-temporal detail of historical changes at the level of individual linguistic features, and which offers a principled mechanism to predict the future. Here we present a statistical field theory for the evolution of linguistic variables which takes steps to fill this gap. Linguistic variant frequencies are represented as a stochastic state field with spatial interaction and social conformity, coupled to a latent bias field with Onsager-Machlup action that reduces overfitting to data. We derive parameter inference procedures…
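To make the ingredients named in the abstract more concrete, here is a minimal toy sketch in Python. It is not the paper's model. Every functional form and parameter name below (`coupling`, `conformity`, `relaxation_rate`, the cubic conformity term, the one-dimensional ring of sites) is an assumption chosen for illustration. The sketch evolves a variant-frequency field with nearest-neighbour spatial smoothing and a conformity push toward the local majority, driven by a latent bias field that is assumed here to follow a mean-reverting (Ornstein-Uhlenbeck-like) random process. Under that assumption the bias decays with a half-life of ln 2 divided by the relaxation rate, which is one way to read the "half-life" finding above.

```python
import math
import numpy as np


def half_life(relaxation_rate):
    """Half-life of an exponentially relaxing bias (assumed OU-like decay)."""
    return math.log(2.0) / relaxation_rate


def simulate(n_sites=50, n_steps=200, dt=0.1, coupling=1.0, conformity=2.0,
             relaxation_rate=0.05, bias_noise=0.1, state_noise=0.02, seed=0):
    """Toy Euler-Maruyama simulation on a 1D ring of sites.

    f: frequency of one linguistic variant at each site, kept in [0, 1].
    h: latent bias field nudging f up or down (hypothetical dynamics).
    """
    rng = np.random.default_rng(seed)
    f = np.full(n_sites, 0.5)   # start with both variants equally common
    h = np.zeros(n_sites)       # start with no bias
    for _ in range(n_steps):
        # Spatial interaction: discrete Laplacian pulls each site toward its neighbours.
        laplacian = np.roll(f, 1) + np.roll(f, -1) - 2.0 * f
        # Social conformity (assumed cubic form): pushes f toward the local majority.
        conform = conformity * f * (1.0 - f) * (2.0 * f - 1.0)
        # Bias enters scaled by f(1 - f) so it vanishes when a variant is fixed.
        drift = coupling * laplacian + conform + h * f * (1.0 - f)
        f = f + dt * drift + math.sqrt(dt) * state_noise * rng.standard_normal(n_sites)
        f = np.clip(f, 0.0, 1.0)
        # Bias field: mean reversion plus noise.
        h = h - dt * relaxation_rate * h + math.sqrt(dt) * bias_noise * rng.standard_normal(n_sites)
    return f, h


if __name__ == "__main__":
    freqs, bias = simulate()
    print("mean variant frequency:", float(freqs.mean()))
    print("assumed bias half-life:", half_life(0.05))
```

In this toy, a larger `relaxation_rate` shortens the half-life, so the bias forgets its current value sooner and forecasts that rely on it lose accuracy faster. The paper's actual inference procedure and action functional are not reproduced here.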
Taxonomy
Topics: Linguistic Variation and Morphology · Language and cultural evolution · Syntax, Semantics, Linguistic Variation
