Speaker Re-identification with Speaker Dependent Speech Enhancement
Yanpei Shi, Qiang Huang, Thomas Hain

TL;DR
This paper presents a novel cascaded approach that combines speaker-dependent speech enhancement with speaker recognition. The two models are trained jointly, and the approach improves identification accuracy in noisy conditions on VoxCeleb1.
Contribution
It introduces a jointly optimised framework that integrates speaker-dependent speech enhancement with speaker recognition, demonstrating better performance than baselines in noisy conditions.
Findings
Improved speaker recognition accuracy in noisy environments.
Enhanced speech quality with speaker-dependent enhancement.
Outperforms baseline methods across various noise conditions.
Abstract
While the use of deep neural networks has significantly boosted speaker recognition performance, it is still challenging to separate speakers in poor acoustic environments. Here, speech enhancement methods have traditionally improved performance, and recent work has shown that adapting speech enhancement can lead to further gains. This paper introduces a novel approach that cascades speech enhancement and speaker recognition. In the first step, a speaker embedding vector is generated, which is used in the second step to enhance the speech quality and re-identify the speakers. Models are trained in an integrated framework with joint optimisation. The proposed approach is evaluated using the VoxCeleb1 dataset, which aims to assess speaker recognition in real-world situations. In addition, three types of noise at different signal-to-noise ratios were added for this work. The obtained…
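The abstract mentions adding noise at different signal-to-noise ratios. A minimal sketch of that kind of mixing is shown below, assuming NumPy and single-channel waveforms. The function names `mix_at_snr` and `measured_snr_db`, the synthetic sine-wave "speech", and the SNR values in the loop are all illustrative assumptions, not the paper's actual pipeline, noise types, or settings.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; not taken from the paper.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so it covers the whole utterance.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("speech and noise must both have non-zero power")

    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


def measured_snr_db(clean, noisy):
    """Recover the SNR of a mixture when the clean signal is known."""
    residual = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
speech = np.sin(2 * np.pi * 220.0 * t)  # stand-in for a 1 s utterance at 16 kHz
noise = rng.standard_normal(8000)       # deliberately shorter than the speech

for snr_db in (0, 5, 10):  # illustrative values only
    noisy = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, noisy), 3))
```

Scaling the noise rather than the speech keeps the level of the clean signal fixed across conditions, which makes it easier to compare results between SNR settings.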
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
