Self-Supervised Graph Transformer for Deepfake Detection
Aminollah Khormali and Jiann-Shiun Yuan

TL;DR
This paper introduces a self-supervised, graph Transformer-based framework for deepfake detection that generalizes across datasets, manipulation types, and common video perturbations, and that supports feature explainability.
Contribution
The study presents a novel self-supervised pre-training approach combined with a graph Transformer architecture for improved deepfake detection and interpretability.
Findings
Outperforms state-of-the-art methods in cross-dataset tests.
Demonstrates robustness against video compression and blur.
Provides explainability through relevancy maps.
Abstract
Deepfake detection methods have shown promising results in recognizing forgeries within a given dataset, where training and testing take place on the in-distribution dataset. However, their performance deteriorates significantly when presented with unseen samples. As a result, a reliable deepfake detection system must remain impartial to forgery types, appearance, and quality for guaranteed generalizable detection performance. Despite various attempts to enhance cross-dataset generalization, the problem remains challenging, particularly when testing against common post-processing perturbations, such as video compression or blur. Hence, this study introduces a deepfake detection framework, leveraging a self-supervised pre-training model that delivers exceptional generalization ability, withstanding common corruptions and enabling feature explainability. The framework comprises three key…
Taxonomy
Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Anomaly Detection Techniques and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Laplacian EigenMap · Byte Pair Encoding · Linear Layer · Softmax · Layer Normalization · Dense Connections · Dropout · Vision Transformer
