Quantifying Language Variation Acoustically with Few Resources

Martijn Bartelds; Martijn Wieling

arXiv:2205.02694·cs.CL·May 26, 2022

TL;DR

This paper shows that embeddings from deep acoustic models (wav2vec 2.0) can distinguish low-resource regional varieties without phonetic transcriptions, and that they do so better than a transcription-based comparison.

Contribution

It shows that pre-trained and fine-tuned wav2vec 2.0 models can quantify language variation directly from audio with very little data, including for low-resource regional languages and dialects.

Findings

01 Acoustic models outperform transcription-based approaches.

02 The multilingual XLSR-53 model fine-tuned on Dutch yields the best results.

03 Effective clustering is achieved with only six seconds of speech.

Abstract

Deep acoustic models represent linguistic information based on massive amounts of data. Unfortunately, for regional languages and dialects such resources are mostly not available. However, deep acoustic models might have learned linguistic information that transfers to low-resource languages. In this study, we evaluate whether this is the case through the task of distinguishing low-resource (Dutch) regional varieties. By extracting embeddings from the hidden layers of various wav2vec 2.0 models (including new models which are pre-trained and/or fine-tuned on Dutch) and using dynamic time warping, we compute pairwise pronunciation differences averaged over 10 words for over 100 individual dialects from four (regional) languages. We then cluster the resulting difference matrix into four groups and compare these to a gold standard, and a partitioning on the basis of comparing phonetic…
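The pipeline described in the abstract (per-word embedding sequences, dynamic time warping, differences averaged over words, then clustering) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the embeddings are synthetic random vectors standing in for wav2vec 2.0 hidden-layer output, the DTW cost is normalised by the summed sequence lengths, clustering uses average linkage, and the names `fake_word`, `dtw_distance`, `pairwise_differences` and `cluster` are hypothetical. The toy data uses three words and two groups rather than the 10 words and four groups mentioned above.

```python
import math
import random


def dtw_distance(a, b):
    """Length-normalised DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)  # normalising by n + m is an assumption


def pairwise_differences(dialects):
    """dialects: {name: {word: [frame vectors]}} -> (names, symmetric matrix)."""
    names = sorted(dialects)
    size = len(names)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            x, y = dialects[names[i]], dialects[names[j]]
            shared = sorted(set(x) & set(y))  # words recorded for both dialects
            mean = sum(dtw_distance(x[w], y[w]) for w in shared) / len(shared)
            matrix[i][j] = matrix[j][i] = mean
    return names, matrix


def cluster(matrix, k):
    """Average-linkage agglomerative clustering down to k groups (assumed method)."""
    groups = [[i] for i in range(len(matrix))]

    def linkage(g, h):
        return sum(matrix[i][j] for i in g for j in h) / (len(g) * len(h))

    while len(groups) > k:
        pairs = [(p, q) for p in range(len(groups)) for q in range(p + 1, len(groups))]
        p, q = min(pairs, key=lambda pq: linkage(groups[pq[0]], groups[pq[1]]))
        groups[p] += groups.pop(q)  # q > p, so index p is unaffected by the pop
    return groups


def fake_word(center, n_frames, rng, dim=4):
    # Synthetic stand-in for a hidden-layer embedding sequence (frames x dim).
    return [[center + rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_frames)]


rng = random.Random(0)
dialects = {
    name: {w: fake_word(center, rng.randint(5, 9), rng) for w in ("w1", "w2", "w3")}
    for name, center in [("A1", 0.0), ("A2", 0.0), ("B1", 5.0), ("B2", 5.0)]
}
names, matrix = pairwise_differences(dialects)
groups = cluster(matrix, k=2)
labels = sorted(sorted(names[i] for i in g) for g in groups)
print(labels)
```

In a real setting the synthetic sequences would be replaced by per-word frame embeddings taken from a chosen hidden layer of an acoustic model, and the grouping would be compared against a gold-standard partition, as the abstract describes.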

Code & Models

Repositories

bartelds/language-variation (official)

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Natural Language Processing Techniques