PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification
Wonbin Kim, Hyun-seo Shin, Ju-ho Kim, Jungwoo Heo, Chan-yeong Lim, and Ha-Jin Yu

TL;DR
This paper introduces PAS, a novel partial additive speech data augmentation technique that enhances noise robustness in speaker verification systems, outperforming traditional methods in reducing error rates.
Contribution
The paper proposes a new partial additive speech (PAS) method for data augmentation, improving noise robustness in speaker verification beyond traditional additive noise approaches.
Findings
PAS outperforms traditional additive noise in EER reduction
Relative EER improvements of 4.64% and 5.01% on SE-ResNet34 and ECAPA-TDNN, respectively
Analysis confirms effectiveness through attention modules and embedding visualization
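The relative improvements quoted above can be read as a percentage reduction in EER against the baseline. The sketch below assumes the usual definition, (baseline - new) / baseline; the function name and the example values are hypothetical placeholders, not figures from the paper.

```python
def relative_improvement(baseline_eer: float, new_eer: float) -> float:
    """Return the relative EER reduction in percent.

    Assumes the common definition (baseline - new) / baseline * 100.
    Both EERs must be expressed in the same unit (e.g. percent).
    """
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


# Placeholder values for illustration only (not results from the paper):
# a drop from 2.0% EER to 1.9% EER is roughly a 5% relative improvement.
example = relative_improvement(2.0, 1.9)
```

Note that a relative improvement of about 5% corresponds to a much smaller absolute change in EER, so the two should not be confused when comparing systems.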
Abstract
Background noise reduces speech intelligibility and quality, making speaker verification (SV) in noisy environments a challenging task. To improve the noise robustness of SV systems, additive noise data augmentation has been commonly used. In this paper, we propose a new additive noise method, partial additive speech (PAS), which aims to train SV systems to be less affected by noisy environments. The experimental results demonstrate that PAS outperforms traditional additive noise in terms of equal error rate (EER), with relative improvements of 4.64% and 5.01% observed in SE-ResNet34 and ECAPA-TDNN. We also show the effectiveness of the proposed method by analyzing attention modules and visualizing speaker embeddings.
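For context, the traditional additive noise augmentation that the abstract uses as its baseline is typically implemented by mixing a noise clip into the whole utterance at a chosen signal-to-noise ratio. The sketch below illustrates that conventional baseline only; it is not the PAS method, whose details are not given on this page. The function name `add_noise_at_snr` and its parameters are assumptions made for illustration.

```python
import numpy as np


def add_noise_at_snr(speech, noise, snr_db, rng=None):
    """Mix `noise` into `speech` at a target SNR in dB.

    Conventional full-utterance additive noise (baseline sketch),
    NOT the PAS method proposed in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise clip if it is shorter than the utterance.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)

    # Pick a random segment of the noise with the same length as the speech.
    start = rng.integers(0, len(noise) - len(speech) + 1)
    noise = noise[start:start + len(speech)]

    # Scale the noise so that speech_power / scaled_noise_power = 10^(snr_db/10).
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # guard against silent noise
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise
```

In a training pipeline, a function like this would usually be applied on the fly with a randomly sampled noise clip and SNR per utterance, so the model sees different noise conditions each epoch.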
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
