A Spatial Modeling Approach for Linguistic Object Data: Analysing dialect sound variations across Great Britain
Shahin Tavakoli, Davide Pigoli, John A. D. Aston, John S. Coleman

TL;DR
This paper introduces novel spatial statistical techniques for analyzing geolocalized speech recordings to map dialect sound variations across Great Britain, enabling continuous spatial analysis and acoustic reconstruction.
Contribution
It presents new methods for modeling spatial variation in speech data using $d$-covariance and spatial smoothing tailored for non-convex domains, advancing dialect mapping.
Findings
Produced detailed dialect variation maps of Great Britain.
Enabled acoustic reconstruction of dialect sounds.
Demonstrated the effectiveness of the methods on British National Corpus data.
Abstract
Dialect variation is of considerable interest in linguistics and other social sciences. However, traditionally it has been studied using proxies (transcriptions) rather than acoustic recordings directly. We introduce novel statistical techniques to analyse geolocalised speech recordings and to explore the spatial variation of pronunciations continuously over the region of interest, as opposed to traditional isoglosses, which provide a discrete partition of the region. Data of this type require an explicit modeling of the variation in the mean and the covariance. Usual Euclidean metrics are not appropriate, and we therefore introduce the concept of $d$-covariance, which allows consistent estimation both in space and at individual locations. We then propose spatial smoothing for these objects which accounts for the possibly non-convex geometry of the domain of interest. We apply the…
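The abstract names $d$-covariance without defining it. As a rough illustration only, the sketch below assumes one common reading: the $d$-covariance is the Fréchet mean, under a chosen metric $d$ on covariance matrices, of the outer products of the centred observations. It further assumes $d$ is the square-root distance $d(A, B) = \lVert A^{1/2} - B^{1/2} \rVert_F$, for which that Fréchet mean has a closed form. Both assumptions, and the function names `sqrt_rank_one` and `d_covariance_sqrt`, are hypothetical and not taken from this page.

```python
import numpy as np


def sqrt_rank_one(v):
    """Matrix square root of the rank-one matrix v v^T.

    For v != 0, (v v^T / ||v||) squared equals v v^T, so no
    eigendecomposition is needed.
    """
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros((v.size, v.size))
    return np.outer(v, v) / norm


def d_covariance_sqrt(X):
    """Sample d-covariance under the (assumed) square-root distance.

    X has shape (n_samples, n_features). Under this distance the
    Frechet mean of the outer products is the square of the average
    of their matrix square roots.
    """
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    roots = np.stack([sqrt_rank_one(v) for v in centred])
    mean_root = roots.mean(axis=0)
    return mean_root @ mean_root


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))  # synthetic data, not speech features
    print(d_covariance_sqrt(X))
```

With a different choice of $d$ the minimiser generally has no closed form and would need an iterative solver. The Euclidean (Frobenius) choice would instead give back the usual sample covariance, up to the $1/n$ normalisation.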
Taxonomy
Topics: Linguistic Variation and Morphology
