Noise Robust Speech Recognition Using Multi-Channel Based Channel Selection And Channel Weighting

Zhaofeng Zhang, Xiong Xiao, Longbiao Wang, Eng Siong Chng, and Haizhou Li

arXiv:1604.03276 · cs.SD · October 4, 2016

TL;DR

This paper explores microphone channel selection and channel weighting techniques for improving speech recognition accuracy in noisy environments. It shows that channel weighting outperforms channel selection and, although it uses no phase information, rivals MVDR beamforming on real test data.

Contribution

It introduces channel weighting methods whose weights are estimated with a constrained maximum likelihood (ML) criterion, improving robustness in noisy speech recognition.

Findings

01. Channel weighting outperforms channel selection.

02. Weighted sum of channels improves recognition accuracy.

03. Channel weighting rivals beamforming in noisy conditions.

Abstract

In this paper, we study several microphone channel selection and weighting methods for robust automatic speech recognition (ASR) in noisy conditions. For channel selection, we investigate two methods based on the maximum likelihood (ML) criterion and minimum autoencoder reconstruction criterion, respectively. For channel weighting, we produce enhanced log Mel filterbank coefficients as a weighted sum of the coefficients of all channels. The weights of the channels are estimated by using the ML criterion with constraints. We evaluate the proposed methods on the CHiME-3 noisy ASR task. Experiments show that channel weighting significantly outperforms channel selection due to its higher flexibility. Furthermore, on real test data in which different channels have different gains of the target signal, the channel weighting method performs equally well or better than the MVDR beamforming,…
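The two ideas in the abstract, picking one channel by likelihood versus forming a weighted sum of all channels, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a single diagonal Gaussian as a stand-in for the acoustic model, and it assumes the only constraint is that the weights sum to one (the abstract says only "with constraints"). All function names are hypothetical.

```python
import numpy as np


def avg_log_likelihood(feats, mean, var):
    """Average per-frame log-likelihood of feats (T, D) under a diagonal Gaussian."""
    diff = feats - mean
    per_frame = -0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var, axis=1)
    return float(per_frame.mean())


def select_channel(channels, mean, var):
    """Channel selection: index of the channel (C, T, D) with the highest likelihood."""
    scores = [avg_log_likelihood(ch, mean, var) for ch in channels]
    return int(np.argmax(scores))


def estimate_weights(channels, mean, var, ridge=1e-8):
    """Channel weighting: ML weights under the assumed constraint sum(w) == 1.

    With a diagonal Gaussian, maximizing the likelihood of the weighted sum is
    a quadratic problem, solved here in closed form with a Lagrange multiplier.
    """
    num_channels, num_frames, _ = channels.shape
    X = channels.reshape(num_channels, -1)       # (C, T*D), frame-major layout
    inv_var = np.tile(1.0 / var, num_frames)     # per-entry inverse variance
    target = np.tile(mean, num_frames)           # per-entry model mean
    A = (X * inv_var) @ X.T + ridge * np.eye(num_channels)
    b = (X * inv_var) @ target
    ones = np.ones(num_channels)
    A_inv_b = np.linalg.solve(A, b)
    A_inv_1 = np.linalg.solve(A, ones)
    lam = (1.0 - ones @ A_inv_b) / (ones @ A_inv_1)  # enforces sum(w) == 1
    return A_inv_b + lam * A_inv_1


def apply_weights(channels, weights):
    """Enhanced features (T, D) as the weighted sum over the channel axis."""
    return np.tensordot(weights, channels, axes=1)
```

Because selecting a single channel corresponds to a one-hot weight vector, which satisfies the sum-to-one constraint, the weighted sum in this sketch can never score a lower model likelihood than the best single channel (up to the tiny ridge term). This is one way to read the abstract's remark that weighting has higher flexibility than selection.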

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing