Study of Pre-processing Defenses against Adversarial Attacks on State-of-the-art Speaker Recognition Systems

Sonal Joshi, Jesús Villalba, Piotr Żelasko, Laureano Moro-Velázquez, and Najim Dehak

arXiv:2101.08909 · eess.AS · October 28, 2024

TL;DR

This paper investigates the vulnerability of state-of-the-art speaker recognition systems to white-box adversarial attacks and evaluates four pre-processing defenses, finding that Parallel WaveGAN combined with randomized smoothing offers the best protection.

Contribution

It introduces and compares four pre-processing defenses against adversarial attacks on speaker recognition systems, demonstrating their effectiveness in a white-box attack scenario.

Findings

01. SR systems are highly vulnerable to BIM, PGD, and CW attacks.

02. Parallel WaveGAN (PWG) with randomized smoothing significantly improves robustness, achieving 93% accuracy.

03. Defense methods outperform baseline adversarial training in protecting SR systems.

Abstract

Adversarial examples against speaker recognition (SR) systems are generated by adding carefully crafted noise to the speech signal so that the system fails while the perturbation remains imperceptible to humans. Such attacks pose severe security risks, making it vital to understand how vulnerable state-of-the-art SR systems are to them. It is even more important to propose defenses that can protect the systems against these attacks. Addressing these concerns, this paper first investigates how state-of-the-art x-vector based SR systems are affected by white-box adversarial attacks, i.e., when the adversary has full knowledge of the system. The x-vector based SR systems are evaluated against white-box adversarial attacks common in the literature, such as the fast gradient sign method (FGSM), the basic iterative method (BIM, a.k.a. iterative FGSM), projected gradient descent…
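To make the attack names in the abstract concrete, here is a minimal sketch of FGSM and BIM in plain Python. It is an illustration only, not the paper's setup: the "system" is a toy linear scorer with a logistic loss rather than an x-vector network, and every name (`score`, `loss_grad`, `fgsm`, `bim`) and every parameter value is a hypothetical choice made for this example. FGSM takes a single step of size `eps` along the sign of the loss gradient with respect to the input. BIM repeats smaller steps of size `alpha` and clips the result so the total perturbation stays within `eps` of the original signal in the L-infinity sense.

```python
import math


def score(w, b, x):
    """Toy linear score (assumption: stands in for a real SR system); > 0 means 'accept'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def loss_grad(w, b, x, y):
    """Gradient w.r.t. x of the logistic loss log(1 + exp(-y * score)), with y in {-1, +1}."""
    s = score(w, b, x)
    coeff = -y / (1.0 + math.exp(y * s))  # dLoss/dScore
    return [coeff * wi for wi in w]


def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """Single-step attack: move each sample by eps in the direction that increases the loss."""
    g = loss_grad(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]


def bim(w, b, x, y, eps, alpha, steps):
    """Iterative attack: repeated steps of size alpha, clipped to stay within eps of x."""
    x_adv = list(x)
    for _ in range(steps):
        g = loss_grad(w, b, x_adv, y)
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        # Project back into the L-infinity ball of radius eps around the original x.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


if __name__ == "__main__":
    # Hypothetical weights and a three-sample "signal" that the toy scorer accepts.
    w, b = [1.0, -2.0, 0.5], 0.0
    x = [0.2, -0.1, 0.4]
    print("clean score:", score(w, b, x))
    print("FGSM score: ", score(w, b, fgsm(w, b, x, 1, 0.3)))
    print("BIM score:  ", score(w, b, bim(w, b, x, 1, 0.3, 0.1, 5)))
```

With a real network the gradient would come from automatic differentiation rather than a closed-form expression, but the update rule has the same shape.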

Taxonomy

Methods: Randomized Smoothing · Dropout · WGAN-GP Loss · Phase Shuffle · Convolution · Tanh Activation · Dense Connections · WaveGAN