Dual-label Deep LSTM Dereverberation For Speaker Verification
Hao Zhang, Stephen Zahorian, Xiao Chen, Peter Guzewich, Xiaoyu Liu

TL;DR
This paper introduces a dual-label deep LSTM approach for dereverberation in speaker verification, using neural networks to map reverberant speech features to clean speech and auxiliary features, improving verification accuracy.
Contribution
It proposes a novel dual-label LSTM neural network model that simultaneously estimates clean speech features and auxiliary spectral information for dereverberation.
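One way to write down the dual-label idea is as a joint regression objective. The summary does not state the loss, so the squared-error form, the weight \lambda, and all symbols below are assumptions chosen for illustration: x_t is the reverberant MFB frame, y_t^{(1)} the clean MFB frame, y_t^{(2)} the auxiliary label (pitch or clean FFT spectrum), and f_\theta a network producing both estimates.

```latex
% Hypothetical dual-label objective (not taken from the paper).
% f_\theta maps a reverberant frame to two estimates at once.
\[
  \bigl(\hat{y}_t^{(1)},\, \hat{y}_t^{(2)}\bigr) = f_\theta(x_1, \dots, x_t)
\]
\[
  \mathcal{L}(\theta) = \sum_{t=1}^{T}
    \Bigl( \bigl\lVert \hat{y}_t^{(1)} - y_t^{(1)} \bigr\rVert_2^2
         + \lambda \, \bigl\lVert \hat{y}_t^{(2)} - y_t^{(2)} \bigr\rVert_2^2 \Bigr)
\]
```

Under this reading, the auxiliary term acts as a regularizer during training, and only the clean MFB estimate would be passed on to the speaker verification system.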
Findings
Reduced equal error rates in speaker verification
Effective mapping of reverberant to clean speech features
Improved robustness against reverberation distortions
Abstract
In this paper, we present a reverberation removal approach for speaker verification, utilizing dual-label deep neural networks (DNNs). The networks perform feature mapping between the spectral features of reverberant and clean speech. Long short-term memory recurrent neural networks (LSTMs) are trained to map corrupted Mel filterbank (MFB) features to two sets of labels: i) the clean MFB features, and ii) either estimated pitch tracks or the fast Fourier transform (FFT) spectrogram of clean speech. The performance of reverberation removal is evaluated by the equal error rates (EERs) obtained in speaker verification experiments.
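Since the evaluation metric is the equal error rate, a minimal sketch of how an EER can be computed from verification scores may help. The function name, the threshold sweep over observed scores, and the example scores are all illustrative assumptions, not the authors' evaluation code.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Return (eer, threshold) at the point where the false acceptance
    rate and the false rejection rate are closest to each other.

    Hypothetical helper: a trial is accepted when score >= threshold.
    """
    # Candidate thresholds: every distinct score observed in either set.
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False rejection: genuine-speaker trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: impostor trials scored at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR as the EER estimate.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up scores, for illustration only.
eer, thr = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(f"EER = {eer:.2%} at threshold {thr}")
```

A lower EER after dereverberation indicates that the mapped features separate genuine and impostor trials better than the reverberant ones.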
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
