Information Theoretic Analysis of DNN-HMM Acoustic Modeling
Pranay Dighe, Afsaneh Asaei, Hervé Bourlard

TL;DR
This paper introduces an information theoretic framework for quantitatively evaluating acoustic models in speech recognition. It measures the information transfer and robustness of DNN-HMM systems without requiring a full speech recognition task.
Contribution
It presents a novel information theoretic approach to assess the accuracy and robustness of DNN-HMM acoustic models, providing insights into model shortcomings and the role of hidden layers.
Findings
DNN posterior probabilities improve information transfer in acoustic models.
The analysis reveals the impact of hidden layers on model robustness.
Low-dimensional models can enhance acoustic modeling for better decoding performance.
Abstract
We propose an information theoretic framework for quantitative assessment of acoustic modeling for hidden Markov model (HMM) based automatic speech recognition (ASR). Acoustic modeling yields the probabilities of HMM sub-word states for a short temporal window of speech acoustic features. We cast ASR as a communication channel where the input sub-word probabilities convey the information about the output HMM state sequence. The quality of the acoustic model is thus quantified in terms of the information transmitted through this channel. The process of inferring the most likely HMM state sequence from the sub-word probabilities is known as decoding. HMM based decoding assumes that an acoustic model yields accurate state-level probabilities and the data distribution given the underlying hidden state is independent of any other state in the sequence. We quantify 1) the acoustic model…
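To make the channel view in the abstract concrete, here is a minimal sketch of one common way to estimate how much information frame-level posteriors carry about the reference HMM states. It is an illustration under stated assumptions, not the estimator used in the paper: the function name `transmitted_information`, the use of a forced-alignment style reference state per frame, and the cross-entropy bound on the conditional entropy are all assumptions introduced here.

```python
import numpy as np


def transmitted_information(posteriors, states):
    """Estimate (in bits) a lower bound on I(state; acoustics).

    Hypothetical helper, not from the paper.
    posteriors: (T, K) array, row t holds p(state | frame t) from the acoustic model.
    states:     (T,) integer array of reference state indices, one per frame.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    states = np.asarray(states, dtype=int)
    num_frames, num_states = posteriors.shape

    # Empirical state prior and its entropy H(S).
    prior = np.bincount(states, minlength=num_states) / num_frames
    nonzero = prior > 0
    h_prior = -np.sum(prior[nonzero] * np.log2(prior[nonzero]))

    # Average negative log-posterior of the reference state.
    # This cross-entropy upper-bounds H(S | X), so the difference
    # below is a lower bound on the mutual information.
    eps = 1e-12  # assumed floor to avoid log(0)
    p_ref = posteriors[np.arange(num_frames), states]
    h_cond = -np.mean(np.log2(np.clip(p_ref, eps, 1.0)))

    return h_prior - h_cond


if __name__ == "__main__":
    ref = np.array([0, 1, 0, 1])
    sharp = np.eye(2)[ref]          # posteriors that always pick the reference state
    flat = np.full((4, 2), 0.5)     # posteriors that carry no information
    print(transmitted_information(sharp, ref))
    print(transmitted_information(flat, ref))
```

With two equally frequent states, posteriors that always match the reference transmit the full entropy of the state prior, while uniform posteriors transmit nothing. The estimate can be computed from posteriors and an alignment alone, which is in the spirit of assessing an acoustic model without decoding.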
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Machine Fault Diagnosis Techniques
