Dual-label Deep LSTM Dereverberation For Speaker Verification

Hao Zhang; Stephen Zahorian; Xiao Chen; Peter Guzewich; Xiaoyu Liu

arXiv:1809.03868 · eess.AS · September 12, 2018

Open Access

TL;DR

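This paper introduces a dual-label deep LSTM approach to dereverberation for speaker verification. The networks map reverberant Mel filterbank (MFB) features to clean MFB features and to an auxiliary target (estimated pitch tracks or the FFT spectrogram of clean speech), improving verification accuracy as measured by equal error rate (EER).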

Contribution

It proposes a dual-label LSTM neural network model that simultaneously estimates clean speech features and an auxiliary target (estimated pitch tracks or the FFT spectrogram of clean speech) for dereverberation.
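To make the dual-label idea concrete, here is a minimal sketch of a training objective that scores one network output against two label sets at once. It is an illustration only: the summary above does not state the paper's loss function, so the mean squared error, the function names `mse` and `dual_label_loss`, and the `aux_weight` parameter are all assumptions.

```python
from typing import Sequence

Frame = Sequence[float]  # one feature vector per time frame


def mse(pred: Sequence[Frame], target: Sequence[Frame]) -> float:
    """Mean squared error over all frames and feature dimensions."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of frames")
    total, count = 0.0, 0
    for p_frame, t_frame in zip(pred, target):
        if len(p_frame) != len(t_frame):
            raise ValueError("frame dimensions must match")
        for p, t in zip(p_frame, t_frame):
            total += (p - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("cannot compute MSE of empty input")
    return total / count


def dual_label_loss(
    pred_mfb: Sequence[Frame],
    clean_mfb: Sequence[Frame],
    pred_aux: Sequence[Frame],
    clean_aux: Sequence[Frame],
    aux_weight: float = 0.5,  # hypothetical weighting, not taken from the paper
) -> float:
    """Primary loss (clean MFB) plus a weighted auxiliary loss (pitch or FFT)."""
    return mse(pred_mfb, clean_mfb) + aux_weight * mse(pred_aux, clean_aux)


# Toy example: two frames of 2-dim "MFB" features and a 1-dim "pitch" track.
loss = dual_label_loss(
    pred_mfb=[[1.0, 2.0], [3.0, 4.0]],
    clean_mfb=[[1.0, 2.0], [3.0, 6.0]],
    pred_aux=[[100.0], [110.0]],
    clean_aux=[[100.0], [112.0]],
)
print(loss)
```

In a real system both predictions would come from output layers that share the same LSTM layers, so the auxiliary label influences the shared representation during training.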

Findings

01 Reduced equal error rates in speaker verification

02 Effective mapping of reverberant to clean speech features

03 Improved robustness against reverberation distortions

Abstract

In this paper, we present a reverberation removal approach for speaker verification, utilizing dual-label deep neural networks (DNNs). The networks perform feature mapping between the spectral features of reverberant and clean speech. Long short-term memory recurrent neural networks (LSTMs) are trained to map corrupted Mel filterbank (MFB) features to two sets of labels: i) the clean MFB features, and ii) either estimated pitch tracks or the fast Fourier transform (FFT) spectrogram of clean speech. The performance of reverberation removal is evaluated by the equal error rates (EERs) of speaker verification experiments.
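Since the evaluation metric is the equal error rate, a short sketch of how an EER can be computed from verification scores may help. This is a generic threshold sweep, not the paper's evaluation code; the function name `equal_error_rate` and the toy scores are made up for illustration, and higher scores are assumed to mean "same speaker".

```python
from typing import Sequence


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold.
    FAR = fraction of impostor trials accepted.
    FRR = fraction of genuine trials rejected.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    if not genuine or not impostor:
        raise ValueError("need at least one genuine and one impostor score")
    best_gap = float("inf")
    eer = 1.0
    for thr in sorted(set(genuine) | set(impostor)):
        far = sum(s >= thr for s in impostor) / len(impostor)
        frr = sum(s < thr for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2.0
    return eer


# Toy scores: one genuine trial (0.4) scores below one impostor trial (0.6).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2%}")  # EER = 25.00%
```

A lower EER after dereverberation indicates that the mapped features separate genuine and impostor trials better than the reverberant ones.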

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing