An improved uncertainty decoding scheme with weighted samples for DNN-HMM hybrid systems

Christian Huemmer; Ramón Fernández Astudillo; Walter Kellermann

arXiv:1609.02082·cs.LG·September 8, 2016

Open Access

TL;DR

This paper improves a sampling-based uncertainty decoding scheme for DNN-HMM hybrid systems by weighting the DNN outputs produced from feature samples instead of averaging them uniformly. The approach is evaluated on the reverberant 8-channel REVERB Challenge task, where it is reported to reduce word error rates.

Contribution

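It proposes a weighted DNN-output averaging, based on a minimum classification error criterion, for sampling-based uncertainty decoding in DNN-HMM hybrid systems, and applies it to a probabilistic distortion model for spatial diffuseness features.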

Findings

01

Weighted DNN-output averaging reduces word error rates.

02

Uncertainty decoding improves recognition accuracy in reverberant conditions.

03

Method outperforms previous sampling-based approaches.

Abstract

In this paper, we advance a recently proposed uncertainty decoding scheme for DNN-HMM (deep neural network - hidden Markov model) hybrid systems. This numerical sampling concept averages DNN outputs produced by a finite set of feature samples (drawn from a probabilistic distortion model) to approximate the posterior likelihoods of the context-dependent HMM states. As the main innovation, we propose a weighted DNN-output averaging based on a minimum classification error criterion and apply it to a probabilistic distortion model for spatial diffuseness features. The experimental evaluation is performed on the 8-channel REVERB Challenge task using a DNN-HMM hybrid system with multichannel front-end signal enhancement. We show that the recognition accuracy of the DNN-HMM hybrid system improves by incorporating uncertainty decoding based on random sampling and that the proposed weighted…
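
The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several parts are assumptions made for illustration: the Gaussian distortion model with diagonal variance, the one-layer `toy_dnn` standing in for the acoustic model, and all function and parameter names. The abstract does not specify how the minimum classification error weights are obtained, so the sketch simply accepts the sample weights as an input and falls back to uniform averaging when none are given.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def toy_dnn(x, W, b):
    # Hypothetical stand-in for the acoustic model: one linear layer + softmax.
    return softmax(x @ W + b)

def sampled_posteriors(mean, var, W, b, num_samples=20, weights=None, rng=None):
    """Approximate HMM-state posteriors by averaging DNN outputs over feature samples."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: the distortion model is Gaussian with diagonal variance.
    noise = rng.standard_normal((num_samples, mean.shape[0]))
    samples = mean + np.sqrt(var) * noise
    outputs = toy_dnn(samples, W, b)  # shape: (num_samples, num_states)
    if weights is None:
        # Plain (unweighted) averaging over the samples.
        weights = np.full(num_samples, 1.0 / num_samples)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the result stays a distribution
    return weights @ outputs

# Usage with made-up dimensions: 4 feature dimensions, 3 HMM states, 8 samples.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
b = np.zeros(3)
mean = rng.standard_normal(4)
var = np.full(4, 0.1)

uniform = sampled_posteriors(mean, var, W, b, num_samples=8,
                             rng=np.random.default_rng(1))
weighted = sampled_posteriors(mean, var, W, b, num_samples=8,
                              weights=np.arange(1, 9),
                              rng=np.random.default_rng(1))
```

Both calls draw the same samples because they share a seed, so any difference between `uniform` and `weighted` comes only from the weights. The weights `1..8` are arbitrary placeholders, not values derived from a minimum classification error criterion.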

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Underwater Acoustics Research