DNN-based uncertainty estimation for weighted DNN-HMM ASR
José Novoa, Josué Fredes, Néstor Becerra Yoma

TL;DR
This paper introduces a DNN-based method to estimate uncertainty in noisy speech observations, enhancing the robustness of DNN-HMM speech recognition systems across various noise conditions.
Contribution
It proposes a novel DNN approach that estimates uncertainty directly from enhanced noisy observations, improving speech recognition accuracy in noisy environments.
Findings
Improved recognition accuracy with uncertainty integration.
Effective uncertainty estimation across multiple noise conditions.
Enhanced robustness in multi-condition training scenarios.
Abstract
In this paper, the uncertainty is defined as the mean square error between a given enhanced noisy observation vector and the corresponding clean one. Then, a DNN is trained on a training database, using enhanced noisy observation vectors as input and the uncertainty as output. In testing, the DNN receives an enhanced noisy observation vector and delivers the estimated uncertainty. This uncertainty is employed in combination with a weighted DNN-HMM based speech recognition system and compared with an existing estimate of the noise cancelling uncertainty variance based on an additive noise model. Experiments were carried out with the Aurora-4 task. Results with clean, multi-noise and multi-condition training are presented.
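The uncertainty definition in the abstract can be illustrated with a minimal sketch of how training targets might be computed. This is not the authors' implementation: the function name `uncertainty_targets`, the frame-by-dimension array layout, and the choice to average over feature dimensions to obtain one value per frame are all assumptions, since the abstract does not say whether the error is taken per frame or per feature dimension.

```python
import numpy as np


def uncertainty_targets(enhanced, clean):
    """Mean square error per frame between enhanced and clean features.

    Both inputs are assumed to be arrays of shape (num_frames, num_dims),
    with row i of `enhanced` aligned to row i of `clean`.
    """
    enhanced = np.asarray(enhanced, dtype=float)
    clean = np.asarray(clean, dtype=float)
    if enhanced.shape != clean.shape:
        raise ValueError("enhanced and clean features must have the same shape")
    # Assumption: average the squared error over feature dimensions,
    # which yields one scalar uncertainty value per frame.
    return np.mean((enhanced - clean) ** 2, axis=1)


# Toy data: two frames with two feature dimensions each (hypothetical values).
clean = np.array([[1.0, 2.0], [0.0, 0.0]])
enhanced = np.array([[1.0, 4.0], [3.0, 4.0]])
targets = uncertainty_targets(enhanced, clean)
print(targets)
```

Under these assumptions, `targets` would serve as the regression output paired with `enhanced` as the DNN input during training. Clean features are needed only to build the targets, so at test time the network would see the enhanced vectors alone.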
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Target Tracking and Data Fusion in Sensor Networks
