What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
Shammur Absar Chowdhury, Nadir Durrani, Ahmed Ali

TL;DR
This study probes pretrained end-to-end speech models for speaker, language, and channel information. It shows how these properties are represented across layers and neurons, and it identifies biases and the minimal subsets that carry this information.
Contribution
It provides a detailed layer-wise and neuron-level analysis of pretrained speech models, uncovering how various speaker and language properties are encoded and distributed.
Findings
Channel and gender information are distributed across the network.
Information relevant to a task is redundantly available across neurons.
Dialectal information is encoded only in task-oriented pretrained networks.
Abstract
Deep neural networks are inherently opaque and challenging to interpret. Unlike hand-crafted feature-based models, we struggle to comprehend the concepts learned and how they interact within these models. This understanding is crucial not only for debugging purposes but also for ensuring fairness in ethical decision-making. In our study, we conduct a post-hoc functional interpretability analysis of pretrained speech models using the probing framework [1]. Specifically, we analyze utterance-level representations of speech models trained for various tasks such as speaker recognition and dialect identification. We conduct layer and neuron-wise analyses, probing for speaker, language, and channel properties. Our study aims to answer the following questions: i) what information is captured within the representations? ii) how is it represented and distributed? and iii) can we identify a…
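The layer-wise and neuron-level probing described in the abstract can be illustrated with a minimal sketch. It is a generic linear probe on synthetic data, not the authors' implementation. The helper names (`train_probe`, `probe_accuracy`, `make_synthetic_layer`), the layer names, and all hyperparameters are assumptions made for illustration. A probe is trained on each layer's utterance-level vectors to predict a binary property (for example, gender), and neurons are then ranked by the magnitude of their probe weights.

```python
import numpy as np

def train_probe(X, y, lr=0.1, epochs=200, l2=1e-3):
    """Fit a binary logistic-regression probe with plain gradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        w -= lr * (X.T @ (p - y) / n + l2 * w)   # gradient step with L2 penalty
        b -= lr * np.mean(p - y)
    return w, b

def probe_accuracy(X, y, w, b):
    """Fraction of examples the probe classifies correctly."""
    return float(np.mean(((X @ w + b) > 0).astype(int) == y))

def make_synthetic_layer(rng, y, dim, informative, signal):
    """Hypothetical stand-in for one layer's utterance-level representations.

    Only the first `informative` neurons carry the property, shifted by
    +/- `signal` depending on the label; the rest are pure noise.
    """
    X = rng.normal(size=(len(y), dim))
    X[:, :informative] += signal * (2 * y[:, None] - 1)
    return X

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=400)                 # synthetic binary property labels
layers = {
    "layer_1": make_synthetic_layer(rng, y, dim=32, informative=4, signal=0.0),
    "layer_2": make_synthetic_layer(rng, y, dim=32, informative=4, signal=1.5),
}

results = {}
for name, X in layers.items():
    w, b = train_probe(X[:300], y[:300])         # train split
    acc = probe_accuracy(X[300:], y[300:], w, b) # held-out split
    top = np.argsort(-np.abs(w))[:4]             # neurons ranked by |weight|
    results[name] = (acc, sorted(top.tolist()))
    print(name, f"accuracy={acc:.2f}", "top neurons:", results[name][1])
```

In this toy setup, high held-out accuracy for a layer suggests the property is linearly recoverable from that layer, and the top-ranked neurons are candidates for a small informative subset. Real analyses would extract representations from an actual pretrained model and compare against control tasks or random baselines.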
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques
