Probing for Phonology in Self-Supervised Speech Representations: A Case Study on Accent Perception
Nitin Venkateswaran, Kevin Tang, Ratree Wayland

TL;DR
This study investigates how self-supervised speech models encode phonological features relevant to accent perception, revealing that certain features strongly predict perceived accent strength and highlighting the models' interpretability in phonological terms.
Contribution
The paper demonstrates that pretrained self-supervised speech representations encode phonological features that are predictive of accent perception, providing insight into the phonological basis of accent judgments.
Findings
Accent strength correlates with specific phonological features in representations.
Pretrained models can predict accent ratings based on segmental feature distances.
Salient phonological features contrast native and non-native speech segments.
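As a rough illustration of what a "segmental feature distance" could look like, the sketch below measures how far one segment's embedding lies from the centroid of native-speaker embeddings for the same segment. This is not the authors' procedure: the choice of cosine distance, the helper names `cosine_distance` and `centroid`, and the three-dimensional toy vectors are all assumptions made for the example. In practice the vectors would come from a pretrained model such as WavLM.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Toy stand-ins for segment embeddings (hypothetical values, not real model output).
native_segments = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
near_native_segment = [1.0, 0.05, 0.0]
accented_segment = [0.0, 1.0, 0.0]

native_centroid = centroid(native_segments)
d_near = cosine_distance(near_native_segment, native_centroid)
d_far = cosine_distance(accented_segment, native_centroid)

# A larger distance from the native centroid would be the candidate
# predictor of a stronger perceived accent under this assumption.
print(d_near < d_far)
```

Under this assumption, each segment token gets one distance value, which could then serve as an input to a model of listener accent ratings.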
Abstract
Traditional models of accent perception underestimate the role of gradient variations in phonological features which listeners rely upon for their accent judgments. We investigate how pretrained representations from current self-supervised learning (SSL) models of speech encode phonological feature-level variations that influence the perception of segmental accent. We focus on three segments: the labiodental approximant, the rhotic tap, and the retroflex stop, which are uniformly produced in the English of native speakers of Hindi as well as other languages in the Indian sub-continent. We use the CSLU Foreign Accented English corpus (Lander, 2007) to extract, for these segments, phonological feature probabilities using Phonet (Vásquez-Correa et al., 2019) and pretrained representations from Wav2Vec2-BERT (Barrault et al., 2023) and WavLM (Chen et al., 2022) along with accent…
Taxonomy
Topics: Phonetics and Phonology Research · Linguistic Variation and Morphology · Speech Recognition and Synthesis
Methods: Logistic Regression · Focus
