PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification

Wonbin Kim, Hyun-seo Shin, Ju-ho Kim, Jungwoo Heo, Chan-yeong Lim, and Ha-Jin Yu

arXiv:2307.10628 · eess.AS · July 21, 2023 · 1 citation

TL;DR

This paper introduces partial additive speech (PAS), a data augmentation technique that improves the noise robustness of speaker verification systems and achieves lower equal error rates (EER) than traditional additive noise augmentation.

Contribution

The paper proposes a new partial additive speech (PAS) method for data augmentation, improving noise robustness in speaker verification beyond traditional additive noise approaches.

Findings

01
PAS achieves a lower EER than traditional additive noise augmentation

02
Relative EER improvements of 4.64% and 5.01% with SE-ResNet34 and ECAPA-TDNN, respectively

03
Analysis of attention modules and visualization of speaker embeddings support the effectiveness of the method
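
The improvements quoted above are relative, not absolute percentage points. As a minimal sketch of how such a figure is usually computed, the helper below assumes the common definition (baseline EER minus new EER, divided by baseline EER). The function name and the example values are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_eer, new_eer):
    """Relative EER reduction in percent.

    Assumes the usual definition:
        100 * (baseline - new) / baseline
    Both inputs are EERs expressed in the same unit (e.g. percent).
    """
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


# Made-up numbers for illustration only (NOT results from the paper):
# a drop from 2.00% EER to 1.90% EER is roughly a 5% relative improvement,
# even though the absolute change is only 0.10 percentage points.
example = relative_improvement(2.00, 1.90)
print(f"{example:.2f}% relative improvement")
```

Under this definition, a relative improvement of about 5% corresponds to a much smaller absolute change in EER, which is worth keeping in mind when comparing systems.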

Abstract

Background noise reduces speech intelligibility and quality, making speaker verification (SV) in noisy environments a challenging task. To improve the noise robustness of SV systems, the additive noise data augmentation method has been commonly used. In this paper, we propose a new additive noise method, partial additive speech (PAS), which aims to train SV systems to be less affected by noisy environments. The experimental results demonstrate that PAS outperforms traditional additive noise in terms of equal error rate (EER), with relative improvements of 4.64% and 5.01% observed in SE-ResNet34 and ECAPA-TDNN, respectively. We also show the effectiveness of the proposed method by analyzing attention modules and visualizing speaker embeddings.
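
For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of conventional additive noise augmentation: a noise clip is scaled and added to a speech waveform so that the mixture reaches a target signal-to-noise ratio (SNR). It is not the PAS method, whose details are not given on this page. The function names, the looping of short noise clips, and the use of plain Python lists are all assumptions made for illustration.

```python
import math


def mean_power(signal):
    """Average power (mean of squared samples) of a waveform."""
    return sum(x * x for x in signal) / len(signal)


def add_noise_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` so the mixture has the requested SNR in dB.

    Assumption: a noise clip shorter than the utterance is looped, and a
    longer one is cropped, so that it covers the whole utterance.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")

    # Loop the noise clip if needed, then crop it to the utterance length.
    reps = math.ceil(len(speech) / len(noise))
    noise = (list(noise) * reps)[:len(speech)]

    speech_power = mean_power(speech)
    noise_power = mean_power(noise)
    if noise_power == 0:
        raise ValueError("noise clip is silent")

    # Choose the gain so that
    #   10 * log10(speech_power / (scale**2 * noise_power)) == snr_db
    scale = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]


# Toy usage with synthetic signals standing in for real audio.
speech = [math.sin(0.05 * i) for i in range(2000)]
noise = [((i * 37) % 11 - 5) / 5.0 for i in range(500)]
mixed = add_noise_at_snr(speech, noise, snr_db=10.0)
```

In a training pipeline the SNR is typically sampled at random per utterance; that choice, like the rest of this sketch, is a general convention rather than something stated in the abstract.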

Code & Models

Repositories

rst0070/Partial_Additive_Speech
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing