Adversarial defense for deep speaker recognition using hybrid adversarial training
Monisankha Pal, Arindam Jati, Raghuveer Peri, Chin-Cheng Hsu, Wael AbdAlmageed, Shrikanth Narayanan

TL;DR
This paper introduces a hybrid adversarial training method for deep speaker recognition that leverages multiple loss functions to craft stronger adversarial examples, significantly improving robustness against attacks.
Contribution
The paper proposes a novel hybrid adversarial training approach combining supervised and unsupervised cues, enhancing defense against adversarial attacks in speaker recognition systems.
Findings
HAT improves adversarial accuracy by over 3% against projected gradient descent (PGD) and Carlini-Wagner (CW) attacks.
HAT maintains high accuracy on benign speech samples.
HAT outperforms existing PGD-based adversarial training methods.
Abstract
Deep neural network based speaker recognition systems can easily be deceived by an adversary using minuscule, imperceptible perturbations to the input speech samples. These adversarial attacks pose serious security threats to speaker recognition systems that rely on speech biometrics. To address this concern, in this work, we propose a new defense mechanism based on a hybrid adversarial training (HAT) setup. In contrast to existing works on countermeasures against adversarial attacks in deep speaker recognition, which use only class-boundary information from a supervised cross-entropy (CE) loss, we propose to exploit additional information from supervised and unsupervised cues to craft diverse and stronger perturbations for adversarial training. Specifically, we employ multi-task objectives using CE, feature-scattering (FS), and margin losses to create adversarial perturbations and include them…
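The idea of crafting a perturbation from more than one loss can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy linear softmax classifier in place of a deep speaker model, uses hand-derived gradients instead of automatic differentiation, combines only the CE and margin terms (the feature-scattering term is omitted because it needs an optimal-transport step), and all names and values (`hybrid_loss`, `pgd_perturb`, `w_ce`, `w_margin`, `eps`, `step`) are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def hybrid_loss(W, b, x, y, w_ce=1.0, w_margin=1.0):
    """Weighted sum of cross-entropy and a logit-margin loss (toy stand-in)."""
    z = W @ x + b
    ce = -np.log(softmax(z)[y])
    margin = np.delete(z, y).max() - z[y]   # best wrong logit minus true logit
    return w_ce * ce + w_margin * margin

def hybrid_grad(W, b, x, y, w_ce=1.0, w_margin=1.0):
    """Analytic (sub)gradient of hybrid_loss with respect to the input x."""
    z = W @ x + b
    p = softmax(z)
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    g_ce = W.T @ (p - onehot)            # gradient of CE for a linear model
    z_masked = z.copy()
    z_masked[y] = -np.inf
    j = int(np.argmax(z_masked))         # strongest competing class
    g_margin = W[j] - W[y]               # gradient of the margin term
    return w_ce * g_ce + w_margin * g_margin

def pgd_perturb(W, b, x, y, eps=0.1, step=0.02, n_steps=10):
    """Signed-gradient ascent on the hybrid loss, projected to an L-inf ball."""
    x_adv = x.copy()
    for _ in range(n_steps):
        g = hybrid_grad(W, b, x_adv, y)
        x_adv = x_adv + step * np.sign(g)
        x_adv = np.clip(x_adv, x - eps, x + eps)   # stay within eps of x
    return x_adv

# Toy data: 3 "speakers", 8-dimensional "features" (assumed sizes).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))
b = np.zeros(3)
x = rng.normal(size=8)
y = int(np.argmax(W @ x + b))            # use the model's own prediction as label
x_adv = pgd_perturb(W, b, x, y)
```

In adversarial training, examples such as `x_adv` would be mixed into the training batches so the model learns to classify them correctly; that training loop is left out of this sketch.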
