Improving the Perturbation-Based Explanation of Deepfake Detectors Through the Use of Adversarially-Generated Samples

Konstantinos Tsigos; Evlampios Apostolidis; Vasileios Mezaris

arXiv:2502.03957 · cs.CV · February 7, 2025

Open Access 2 Repos

TL;DR

This paper improves perturbation-based visual explanations of deepfake detectors by building the perturbation masks from adversarially-generated samples of the input image.

Contribution

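The paper proposes generating adversarial samples with Natural Evolution Strategies, aiming to flip the detector's decision from deepfake to real, and using these samples to form the perturbation masks of four explanation methods (LIME, SHAP, SOBOL and RISE).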

Findings

01 Improved explanation accuracy in identifying manipulated regions

02 Enhanced performance of explanation methods with adversarial samples

03 Positive impact on explanation quality demonstrated through quantitative and qualitative analysis

Abstract

In this paper, we introduce the idea of using adversarially-generated samples of the input images that were classified as deepfakes by a detector to form perturbation masks for inferring the importance of different input features and producing visual explanations. We generate these samples based on Natural Evolution Strategies, aiming to flip the original deepfake detector's decision so that it classifies these samples as real. We apply this idea to four perturbation-based explanation methods (LIME, SHAP, SOBOL and RISE) and evaluate the performance of the resulting modified methods using a state-of-the-art deepfake detection model, a benchmarking dataset (FaceForensics++) and a corresponding explanation evaluation framework. Our quantitative assessments document the mostly positive contribution of the proposed perturbation approach to the performance of explanation methods. Our qualitative analysis shows…
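
The sample-generation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_detector` is a hypothetical stand-in for a real deepfake detector, the hyperparameters (`sigma`, `eps`, `step`, `n_pairs`) are arbitrary, and taking the absolute pixel difference as a perturbation mask is an assumption made only for illustration. The sketch estimates a gradient with antithetic Natural Evolution Strategies sampling, using score queries only, and steps the image until the toy score drops below 0.5.

```python
import numpy as np


def toy_detector(image):
    """Hypothetical stand-in for a detector: returns P(fake) in [0, 1]."""
    # The score grows with the mean intensity of the top-left 4x4 patch.
    logit = 8.0 * (image[:4, :4].mean() - 0.5)
    return 1.0 / (1.0 + np.exp(-logit))


def nes_gradient(score_fn, image, rng, n_pairs=20, sigma=0.05):
    """Estimate the gradient of score_fn at image from score queries only."""
    grad = np.zeros_like(image)
    for _ in range(n_pairs):
        noise = rng.standard_normal(image.shape)
        # Antithetic pair: query the score at +noise and -noise.
        diff = score_fn(image + sigma * noise) - score_fn(image - sigma * noise)
        grad += diff * noise
    return grad / (2.0 * n_pairs * sigma)


def nes_attack(score_fn, image, eps=0.3, step=0.02, max_iters=200, seed=0):
    """Lower P(fake) below 0.5 while staying within eps of the input."""
    rng = np.random.default_rng(seed)
    adv = image.copy()
    for _ in range(max_iters):
        if score_fn(adv) < 0.5:  # decision flipped to "real"
            break
        grad = nes_gradient(score_fn, adv, rng)
        adv = adv - step * np.sign(grad)              # descend on P(fake)
        adv = np.clip(adv, image - eps, image + eps)  # stay near the input
        adv = np.clip(adv, 0.0, 1.0)                  # keep valid pixel range
    return adv


image = np.full((8, 8), 0.7)                 # toy "fake" image
adversarial = nes_attack(toy_detector, image)
mask = np.abs(adversarial - image)           # assumed mask: per-pixel change
print(toy_detector(image), toy_detector(adversarial))
```

Because the gradient estimate is noisy, pixels outside the patch that drives the toy score also receive small random changes, so a mask derived this way would need further processing before it could be used by an explanation method.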

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Digital Media Forensic Detection

Methods: FLIP · Shapley Additive Explanations