Adversarial Transformation of Spoofing Attacks for Voice Biometrics
Alejandro Gomez-Alanis, Jose A. Gonzalez-Lopez, Antonio M. Peinado

TL;DR
This paper introduces a new adversarial transformation network that creates sophisticated spoofing attacks capable of bypassing both anti-spoofing and speaker verification systems, revealing vulnerabilities in current voice biometric security.
Contribution
The study develops what the authors describe as, to the best of their knowledge, the first joint adversarial attack framework targeting combined voice biometric and anti-spoofing systems, and demonstrates its effectiveness against state-of-the-art defenses.
Findings
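The proposed adversarial biometrics transformation network (ABTN) outperforms existing adversarial techniques in white-box scenarios.
ABTN fools both the presentation attack detection (PAD) and automatic speaker verification (ASV) systems in black-box settings.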
The attacks significantly compromise the security of voice biometric systems.
Abstract
Voice biometric systems based on automatic speaker verification (ASV) are exposed to spoofing attacks which may compromise their security. To increase robustness against such attacks, anti-spoofing or presentation attack detection (PAD) systems have been proposed for the detection of replay, synthesis, and voice-conversion-based attacks. Recently, the scientific community has shown that PAD systems are also vulnerable to adversarial attacks. However, to the best of our knowledge, no previous work has studied the robustness of full voice biometrics systems (ASV + PAD) to these new types of adversarial spoofing attacks. In this work, we develop a new adversarial biometrics transformation network (ABTN) which jointly processes the loss of the PAD and ASV systems in order to generate white-box and black-box adversarial spoofing attacks. The core idea of this…
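The idea of jointly processing the PAD and ASV losses can be illustrated with a small sketch. This is not the ABTN from the paper, which is a trained transformation network: the sketch swaps in a plain iterative signed-gradient step on a weighted sum of two toy losses. The linear PAD and ASV surrogates, the weight `beta`, the bound `eps`, and the function names `joint_loss_and_grad` and `perturb` are all assumptions made for illustration.

```python
import numpy as np


def joint_loss_and_grad(x, w_pad, w_asv, enrol, beta=0.5):
    """Return (loss, gradient) of a toy joint PAD + ASV objective.

    Lower loss is better for the attacker. Both models are hypothetical
    linear surrogates, not the systems evaluated in the paper.
    """
    # PAD surrogate: logistic score, where a large z means "bona fide".
    z = w_pad @ x
    pad_loss = np.logaddexp(0.0, -z)        # equals -log(sigmoid(z))
    pad_grad = -w_pad / (1.0 + np.exp(z))   # derivative of pad_loss w.r.t. x

    # ASV surrogate: dot-product similarity with the enrolled embedding.
    asv_loss = -((w_asv @ x) @ enrol)
    asv_grad = -(w_asv.T @ enrol)

    # Joint objective: weighted sum of the two losses (beta is assumed).
    loss = beta * pad_loss + (1.0 - beta) * asv_loss
    grad = beta * pad_grad + (1.0 - beta) * asv_grad
    return float(loss), grad


def perturb(x, w_pad, w_asv, enrol, eps=0.01, steps=10, beta=0.5):
    """Iterative signed-gradient descent on the joint loss, bounded by eps."""
    x_adv = x.copy()
    step = eps / steps
    for _ in range(steps):
        _, grad = joint_loss_and_grad(x_adv, w_pad, w_asv, enrol, beta)
        x_adv = x_adv - step * np.sign(grad)
        # Keep the perturbation inside an L-infinity ball of radius eps.
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv


if __name__ == "__main__":
    # Toy inputs standing in for a spoofed utterance's features.
    x = np.zeros(8)
    w_pad = np.linspace(0.1, 0.8, 8)
    w_asv = np.eye(4, 8)
    enrol = np.ones(4)
    x_adv = perturb(x, w_pad, w_asv, enrol)
    print("loss before:", joint_loss_and_grad(x, w_pad, w_asv, enrol)[0])
    print("loss after: ", joint_loss_and_grad(x_adv, w_pad, w_asv, enrol)[0])
```

The single weighted objective is the point of the sketch: one perturbation is pushed toward being accepted as bona fide by the PAD surrogate and as the target speaker by the ASV surrogate at the same time, rather than attacking each system separately.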
Taxonomy
Topics: Speech Recognition and Synthesis · Digital Media Forensic Detection · Natural Language Processing Techniques
