Targeted Augmented Data for Audio Deepfake Detection
Marcella Astrid, Enjie Ghorbel, Djamila Aouada

TL;DR
This paper introduces a novel augmentation method for audio deepfake detection that synthesizes pseudo-fake data to improve model robustness and generalization against unseen manipulations.
Contribution
The paper proposes an adversarial-inspired augmentation technique that generates ambiguous pseudo-fake audio data to enhance deepfake detector robustness.
Findings
Improved generalization on unseen deepfake manipulations
Enhanced robustness of detection models with the proposed augmentation
Demonstrated effectiveness on two well-known architectures
Abstract
The availability of highly convincing audio deepfake generators highlights the need for designing robust audio deepfake detectors. Existing works often rely solely on real and fake data available in the training set, which may lead to overfitting, thereby reducing the robustness to unseen manipulations. To enhance the generalization capabilities of audio deepfake detectors, we propose a novel augmentation method for generating audio pseudo-fakes targeting the decision boundary of the model. Inspired by adversarial attacks, we perturb original real data to synthesize pseudo-fakes with ambiguous prediction probabilities. Comprehensive experiments on two well-known architectures demonstrate that the proposed augmentation contributes to improving the generalization capabilities of these architectures.
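The core idea in the abstract, perturbing real audio until the detector's prediction becomes ambiguous, can be illustrated with a minimal sketch. This is not the authors' implementation: the linear logistic "detector", the sign-gradient update, and every name and parameter below (`fake_probability`, `make_pseudo_fake`, `target`, `step`, `tol`) are assumptions made purely for illustration. A real system would use a trained neural detector and automatic differentiation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fake_probability(x, w, b):
    """Toy linear detector (assumed): probability that sample x is fake."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def make_pseudo_fake(x, w, b, target=0.5, step=0.005, tol=0.02, max_iters=500):
    """Perturb a real sample until the detector's output is close to `target`.

    Each iteration takes a small sign-gradient step (in the spirit of
    adversarial attacks) that moves the prediction toward the ambiguous
    region around the decision boundary, then stops once it is within `tol`.
    """
    x_pf = list(x)
    for _ in range(max_iters):
        p = fake_probability(x_pf, w, b)
        if abs(p - target) <= tol:
            break
        # For a linear logit, d(logit)/dx_i = w_i, so moving along sign(w_i)
        # raises the fake probability and moving against it lowers it.
        direction = 1.0 if p < target else -1.0
        x_pf = [xi + direction * step * sign(wi) for xi, wi in zip(x_pf, w)]
    return x_pf

# Toy setup (all values hypothetical): a 16-sample "waveform" and fixed weights.
w = [((-1) ** i) * (i + 1) / 16 for i in range(16)]
b = -3.0
real = [0.1 * math.sin(0.3 * i) for i in range(16)]

pseudo_fake = make_pseudo_fake(real, w, b)
p_real = fake_probability(real, w, b)           # confidently "real"
p_pseudo = fake_probability(pseudo_fake, w, b)  # near the decision boundary
```

In this sketch the resulting `pseudo_fake` samples would be added to the training set as extra fake examples; how the paper labels and weights them is not stated in the abstract, so that step is left out here.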
Taxonomy
Topics: Digital Media Forensic Detection · Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis
