DNN-based uncertainty estimation for weighted DNN-HMM ASR

José Novoa; Josué Fredes; Néstor Becerra Yoma

arXiv:1705.10368 · cs.SD · May 31, 2017

Open Access

TL;DR

This paper introduces a DNN-based method to estimate uncertainty in noisy speech observations, enhancing the robustness of DNN-HMM speech recognition systems across various noise conditions.

Contribution

It proposes a novel DNN approach to estimate uncertainty directly from enhanced noisy observations, improving speech recognition accuracy under noisy environments.

Findings

01

Improved recognition accuracy with uncertainty integration.

02

Effective uncertainty estimation across multiple noise conditions.

03

Enhanced robustness in multi-condition training scenarios.

Abstract

In this paper, the uncertainty is defined as the mean square error between a given enhanced noisy observation vector and the corresponding clean one. Then, a DNN is trained by using enhanced noisy observation vectors as input and the uncertainty as output with a training database. In testing, the DNN receives an enhanced noisy observation vector and delivers the estimated uncertainty. This uncertainty is employed in combination with a weighted DNN-HMM based speech recognition system and compared with an existing estimation of the noise cancelling uncertainty variance based on an additive noise model. Experiments were carried out with the Aurora-4 task. Results with clean, multi-noise and multi-condition training are presented.
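
As a rough illustration of the uncertainty definition in the abstract, the following Python sketch builds (input, target) training pairs from frame-aligned enhanced and clean feature vectors. It is only a sketch under stated assumptions: the abstract does not say whether the squared error is averaged over feature dimensions or kept per dimension, so a per-frame average is assumed here, and the names `frame_uncertainty` and `build_training_pairs` as well as the toy feature values are hypothetical, not taken from the paper.

```python
def frame_uncertainty(enhanced_frame, clean_frame):
    """Mean squared error between one enhanced frame and its clean counterpart.

    Assumption: the error is averaged over the feature dimensions of the frame.
    """
    if len(enhanced_frame) != len(clean_frame) or len(clean_frame) == 0:
        raise ValueError("frames must be non-empty and have the same dimension")
    squared_errors = [(e - c) ** 2 for e, c in zip(enhanced_frame, clean_frame)]
    return sum(squared_errors) / len(squared_errors)


def build_training_pairs(enhanced_frames, clean_frames):
    """Pair each enhanced frame (network input) with its uncertainty (target)."""
    if len(enhanced_frames) != len(clean_frames):
        raise ValueError("enhanced and clean sequences must be frame-aligned")
    return [
        (list(e), frame_uncertainty(e, c))
        for e, c in zip(enhanced_frames, clean_frames)
    ]


# Toy, made-up features: 3 frames with 2 dimensions each.
clean = [[1.0, 2.0], [0.0, 0.0], [3.0, -1.0]]
enhanced = [[1.0, 2.0], [1.0, -1.0], [3.0, 1.0]]

pairs = build_training_pairs(enhanced, clean)
for features, target in pairs:
    print(features, target)
```

A regression network would then be fitted on such pairs, and at test time only the enhanced frame is needed to obtain an uncertainty estimate, since the clean reference is unavailable.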

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Target Tracking and Data Fusion in Sensor Networks