TL;DR
This paper adapts interpretability methods to analyze how acoustic and semantic information evolves in ASR systems, revealing internal dynamics and biases that can inform improvements in transparency and robustness.
Contribution
It systematically applies established interpretability techniques to ASR, uncovering new insights into internal representations and interactions within speech recognition models.
Findings
Identified encoder-decoder interactions causing repetition hallucinations
Discovered semantic biases in deep acoustic representations
Revealed internal dynamics of information flow in ASR models
Abstract
Interpretability methods have recently gained significant attention, particularly in the context of large language models, enabling insights into linguistic representations, error detection, and model behaviors such as hallucinations and repetitions. However, these techniques remain underexplored in automatic speech recognition (ASR), despite their potential to advance both the performance and interpretability of ASR systems. In this work, we adapt and systematically apply established interpretability methods, such as logit lens, linear probing, and activation patching, to examine how acoustic and semantic information evolves across layers in ASR systems. Our experiments reveal previously unknown internal dynamics, including specific encoder-decoder interactions responsible for repetition hallucinations and semantic biases encoded deep within acoustic representations. These insights…
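For readers unfamiliar with the logit lens named in the abstract, the following is a minimal toy sketch of the general idea: each intermediate layer's hidden state is normalized and projected through the model's final unembedding matrix to see which token that layer would already predict. It is not the paper's implementation. The function names, the three-word vocabulary, the identity unembedding matrix, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def layer_norm(h, eps=1e-5):
    """Normalize a hidden vector to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(h) / len(h)
    var = sum((x - mean) ** 2 for x in h) / len(h)
    return [(x - mean) / math.sqrt(var + eps) for x in h]


def logit_lens(hidden_states, unembed, vocab):
    """Decode every layer's hidden state with the final unembedding matrix.

    hidden_states: one hidden vector per layer (for a single position)
    unembed: one weight row per vocabulary entry
    Returns the top-scoring token at each layer.
    """
    decoded = []
    for h in hidden_states:
        normed = layer_norm(h)
        # One logit per vocabulary entry: dot product of its weight row with the hidden state.
        logits = [sum(w * x for w, x in zip(row, normed)) for row in unembed]
        best = max(range(len(logits)), key=logits.__getitem__)
        decoded.append(vocab[best])
    return decoded


# Hypothetical toy setup: 3-dimensional hidden states and an identity unembedding.
vocab = ["cat", "cap", "cab"]
unembed = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
hidden_states = [
    [0.1, 0.9, 0.2],  # early layer
    [0.5, 0.6, 0.1],  # middle layer
    [0.9, 0.2, 0.1],  # final layer
]

per_layer_tokens = logit_lens(hidden_states, unembed, vocab)
print(per_layer_tokens)
```

In this toy setup the early and middle layers favor a different token than the final layer. With a real ASR model, the hidden states would come from the decoder layers and the projection would use the model's own final normalization and output embedding.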