Transferable Adversarial Attacks on Audio Deepfake Detection
Muhammad Umar Farooq, Awais Khan, Kutub Uddin, Khalid Mahmood Malik

TL;DR
This paper demonstrates that current audio deepfake detection systems are highly vulnerable to transferable adversarial attacks generated by a novel GAN-based framework, highlighting the need for improved robustness.
Contribution
The paper introduces a transferable GAN-based adversarial attack framework that evaluates state-of-the-art audio deepfake detection methods and exposes their vulnerabilities.
Findings
Significant accuracy drops in SOTA ADD systems under attack
High-quality adversarial examples maintain transcription and perceptual integrity
Vulnerabilities are consistent across multiple datasets and attack scenarios
Abstract
Audio deepfakes pose significant threats, including impersonation, fraud, and reputation damage. To address these risks, audio deepfake detection (ADD) techniques have been developed, demonstrating success on benchmarks like ASVspoof2019. However, their resilience against transferable adversarial attacks remains largely unexplored. In this paper, we introduce a transferable GAN-based adversarial attack framework to evaluate the effectiveness of state-of-the-art (SOTA) ADD systems. By leveraging an ensemble of surrogate ADD models and a discriminator, the proposed approach generates transferable adversarial attacks that better reflect real-world scenarios. Unlike previous methods, the proposed framework incorporates a self-supervised audio model to ensure transcription and perceptual integrity, resulting in high-quality adversarial attacks. Experimental results on benchmark dataset…
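The abstract names three ingredients that shape the attack generator: an ensemble of surrogate ADD models, a discriminator, and a self-supervised audio model that preserves transcription and perceptual integrity. The sketch below shows one plausible way such terms could be combined into a single generator objective. It is an illustration only: the abstract does not give the loss formulation, so the function names (`bce`, `generator_objective`), the weights (`w_ens`, `w_disc`, `w_feat`), the choice of binary cross-entropy, and the mean-squared feature distance are all assumptions, not the authors' method.

```python
import math


def bce(p, target):
    """Binary cross-entropy of a single probability p against a 0/1 target."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def generator_objective(surrogate_probs, disc_prob, feat_clean, feat_adv,
                        w_ens=1.0, w_disc=1.0, w_feat=1.0):
    """Hypothetical combined loss for an adversarial generator.

    surrogate_probs: P(bona fide) from each surrogate detector for the
                     adversarial audio (assumed outputs, one per model).
    disc_prob:       discriminator's P(real) for the adversarial audio.
    feat_clean:      features of the original audio from a self-supervised
                     audio model (assumed to be a flat list of floats).
    feat_adv:        features of the adversarial audio, same length.
    """
    # Ensemble term: averaging over surrogates rewards examples that fool
    # all of them, which is the usual motivation for transferability.
    ens = sum(bce(p, 1) for p in surrogate_probs) / len(surrogate_probs)
    # Discriminator term: the adversarial audio should be judged real.
    disc = bce(disc_prob, 1)
    # Integrity term: keep self-supervised features close to the original.
    feat = sum((a - b) ** 2 for a, b in zip(feat_clean, feat_adv)) / len(feat_clean)
    return w_ens * ens + w_disc * disc + w_feat * feat


# Toy values: an example that fools the surrogates and keeps its features
# scores lower than one that is detected and distorted.
good = generator_objective([0.9, 0.8], 0.9, [0.1, 0.2], [0.1, 0.2])
bad = generator_objective([0.1, 0.2], 0.1, [0.1, 0.2], [0.9, 0.7])
```

In a real training loop each term would be computed on batches of waveforms by neural networks and minimized by gradient descent; the scalar version here only shows how the three pressures trade off against each other.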
Taxonomy
Topics: Digital Media Forensic Detection · Adversarial Robustness in Machine Learning · Image and Signal Denoising Methods
