Assessing the Impact of Anisotropy in Neural Representations of Speech: A Case Study on Keyword Spotting

Guillaume Wisniewski (LLF - UMR7110); Séverine Guillaume (LACITO); Clara Rosina Fernández (LACITO)

arXiv:2506.11096 · cs.SD · June 16, 2025

TL;DR

This paper investigates how anisotropy in pretrained speech models affects keyword spotting, demonstrating that despite anisotropy, models like wav2vec2 effectively identify words and capture phonetic structures.

Contribution

It analyzes how anisotropy affects a downstream speech task, keyword spotting, and shows that pretrained speech representations remain robust in that setting.

Findings

01 Wav2vec2 embeddings effectively identify words despite anisotropy.

02 Pretrained speech models capture phonetic structures and generalize across speakers.

03 Anisotropy does not hinder the utility of speech representations in keyword spotting.

Abstract

Pretrained speech representations like wav2vec2 and HuBERT exhibit strong anisotropy, leading to high similarity between random embeddings. While widely observed, the impact of this property on downstream tasks remains unclear. This work evaluates anisotropy in keyword spotting for computational documentary linguistics. Using Dynamic Time Warping, we show that despite anisotropy, wav2vec2 similarity measures effectively identify words without transcription. Our results highlight the robustness of these representations, which capture phonetic structures and generalize across speakers. Our results underscore the importance of pretraining in learning rich and invariant speech representations.
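
The two ideas the abstract relies on, anisotropy as high similarity between arbitrary embeddings and Dynamic Time Warping as the matching procedure, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the function names are hypothetical, cosine distance is assumed as the frame-level cost, the path-length normalization is one common choice among several, and arrays of shape (frames, dimensions) stand in for real wav2vec2 frame embeddings.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def mean_pairwise_cosine(frames):
    """Rough anisotropy estimate: average cosine similarity over all
    pairs of distinct frames. Values near 0 suggest an isotropic space;
    values near 1 suggest strong anisotropy."""
    sim = cosine_similarity_matrix(frames, frames)
    off_diagonal = ~np.eye(len(frames), dtype=bool)
    return float(sim[off_diagonal].mean())


def dtw_cost(query, candidate):
    """Classic DTW alignment cost between two frame sequences, using
    cosine distance (1 - cosine similarity) as the local cost.
    The total is divided by (len(query) + len(candidate)), an assumed
    normalization so that sequences of different lengths are comparable."""
    dist = 1.0 - cosine_similarity_matrix(query, candidate)
    n, m = dist.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_previous = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = dist[i - 1, j - 1] + best_previous
    return float(acc[n, m] / (n + m))
```

In a keyword spotting setting, one would compute `dtw_cost` between the embedding sequence of a spoken query and candidate segments of a recording, and treat the lowest-cost segments as likely occurrences of the word. With strongly anisotropic embeddings all cosine distances shrink toward zero, so what matters for retrieval is whether the ranking of candidates is preserved, not the absolute cost values.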

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques