Towards Quantitative Evaluation of Explainable AI Methods for Deepfake Detection

Konstantinos Tsigos; Evlampios Apostolidis; Spyridon Baxevanakis; Symeon Papadopoulos; Vasileios Mezaris

arXiv:2404.18649·cs.CV·April 30, 2024

TL;DR

This paper introduces a framework for quantitatively evaluating explanation methods for deepfake detection, focusing on their ability to identify influential image regions by measuring the impact of adversarial modifications.

Contribution

It proposes a novel evaluation framework for explanation methods in deepfake detection and compares several methods, highlighting LIME's superior performance.

Findings

01. LIME outperforms other explanation methods in identifying influential regions.

02. The framework effectively measures explanation quality through adversarial attack impact.

03. LIME's explanations lead to greater reductions in detection accuracy when regions are modified.

Abstract

In this paper we propose a new framework for evaluating the performance of explanation methods on the decisions of a deepfake detector. This framework assesses the ability of an explanation method to spot the regions of a fake image with the biggest influence on the decision of the deepfake detector, by examining the extent to which these regions can be modified through a set of adversarial attacks, in order to flip the detector's prediction or reduce its initial prediction; we anticipate a larger drop in deepfake detection accuracy and prediction, for methods that spot these regions more accurately. Based on this framework, we conduct a comparative study using a state-of-the-art model for deepfake detection that has been trained on the FaceForensics++ dataset, and five explanation methods from the literature. The findings of our quantitative and qualitative evaluations document the…
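
The evaluation idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the toy detector, the grid segmentation, the two hand-written explainers, and the function names are hypothetical. In particular, the adversarial attack is replaced by a plain constant-fill occlusion of the top-scoring regions, which is a much simpler perturbation than the one the framework uses. The sketch only shows the shape of the comparison: an explainer that ranks the truly influential regions first should cause a larger drop in detection accuracy and in the predicted fake score.

```python
import numpy as np

def grid_segments(h, w, cell):
    """Label each pixel with the index of the square grid cell it falls in."""
    rows = np.arange(h) // cell
    cols = np.arange(w) // cell
    n_cols = -(-w // cell)  # ceiling division: number of cells per row
    return rows[:, None] * n_cols + cols[None, :]

def perturb_top_segments(image, segments, scores, k, fill=0.0):
    """Overwrite the k highest-scoring segments with a constant value.

    Stand-in for the adversarial modification: simple occlusion, NOT an attack.
    """
    top = np.argsort(scores)[::-1][:k]
    mask = np.isin(segments, top)
    out = image.copy()
    out[mask] = fill
    return out

def evaluate_explainer(detector, explainer, fake_images, segments, k, threshold=0.5):
    """Compare detector outputs on fake images before and after perturbation."""
    before = np.array([detector(x) for x in fake_images])
    after = np.array([
        detector(perturb_top_segments(x, segments, explainer(x, segments), k))
        for x in fake_images
    ])
    return {
        "accuracy_before": float(np.mean(before >= threshold)),
        "accuracy_after": float(np.mean(after >= threshold)),
        "mean_prediction_drop": float(np.mean(before - after)),
    }

# Hypothetical toy detector: the "fake" score depends only on the top-left quadrant.
def toy_detector(image):
    return float(np.clip(image[:4, :4].mean(), 0.0, 1.0))

# Two hypothetical explainers returning one importance score per segment.
def good_explainer(image, segments):
    return np.array([1.0, 0.0, 0.0, 0.0])  # points at the influential quadrant

def bad_explainer(image, segments):
    return np.array([0.0, 0.0, 0.0, 1.0])  # points at an irrelevant quadrant

rng = np.random.default_rng(0)
fakes = [rng.uniform(0.6, 1.0, size=(8, 8)) for _ in range(20)]
segments = grid_segments(8, 8, cell=4)  # four 4x4 segments, indexed 0..3

good = evaluate_explainer(toy_detector, good_explainer, fakes, segments, k=1)
bad = evaluate_explainer(toy_detector, bad_explainer, fakes, segments, k=1)
print("good explainer:", good)
print("bad explainer: ", bad)
```

In this toy setup the explainer that targets the influential quadrant drives the detector's accuracy on fake images down, while the one that targets an irrelevant quadrant leaves the predictions unchanged. Swapping in a real detector, real explanation methods such as LIME, and an actual adversarial attack would follow the same comparison structure.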

Code & Models

Repositories

idt-iti/xai-deepfakes
PyTorch · Official

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)

Methods: Sparse Evolutionary Training · Local Interpretable Model-Agnostic Explanations · FLIP