Deepfake detectors are DUMB: A benchmark to assess adversarial training robustness under transferability constraints
Adrian Serrano, Erwan Umlil, Ronan Thomas

TL;DR
This paper evaluates the robustness of deepfake detectors against adversarial attacks under transferability constraints, revealing that adversarial training can both improve and impair performance depending on the scenario.
Contribution
It extends the DUMB benchmark to deepfake detection, providing a comprehensive evaluation of adversarial robustness across multiple detectors, attacks, and datasets.
Findings
Adversarial training improves robustness in the in-distribution setting.
Cross-dataset robustness can be degraded by adversarial training.
Results highlight the importance of case-aware defense strategies.
Abstract
Deepfake detection systems deployed in real-world environments are subject to adversaries capable of crafting imperceptible perturbations that degrade model performance. While adversarial training is a widely adopted defense, its effectiveness under realistic conditions, where attackers operate with limited knowledge and mismatched data distributions, remains underexplored. In this work, we extend the DUMB (Dataset soUrces, Model architecture and Balance) and DUMBer methodology to deepfake detection. We evaluate detectors' robustness against adversarial attacks under transferability constraints and cross-dataset configurations to extract real-world insights. Our study spans five state-of-the-art detectors (RECCE, SRM, XCeption, UCF, SPSL), three attacks (PGD, FGSM, FPBA), and two datasets (FaceForensics++ and Celeb-DF-V2). We analyze both attacker and defender perspectives mapping…
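As a rough illustration of what "transferability constraints" can mean here, the sketch below enumerates the attacker/defender configurations implied by the three factors in the DUMB acronym. It assumes each factor is simply either matched or mismatched between the attacker's surrogate setup and the defender's deployed detector; the factor names and the `enumerate_scenarios` helper are hypothetical and are not taken from the paper's code.

```python
from itertools import product

# Factor names follow the DUMB acronym in the abstract; treating each one as a
# binary matched/mismatched flag is an assumption made for this sketch.
FACTORS = ("dataset_source", "model_architecture", "balance")


def enumerate_scenarios(factors=FACTORS):
    """Return one dict per scenario; True means attacker and defender match on that factor."""
    return [
        dict(zip(factors, combo))
        for combo in product((True, False), repeat=len(factors))
    ]


scenarios = enumerate_scenarios()

# Print each scenario with the factors on which attacker and defender differ.
for index, scenario in enumerate(scenarios, start=1):
    mismatched = [name for name, same in scenario.items() if not same]
    print(f"scenario {index}: mismatched = {mismatched or 'none'}")
```

Under this assumption, the first scenario is the fully matched case and the last is the case where the attacker differs from the defender on every factor, which is where an attack has to rely on transferability the most.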
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
