Vulnerability-Aware Spatio-Temporal Learning for Generalizable Deepfake Video Detection
Dat Nguyen, Marcella Astrid, Anis Kacem, Enjie Ghorbel, Djamila Aouada

TL;DR
FakeSTormer is a novel deepfake detection method that models subtle spatio-temporal inconsistencies using multi-task learning and data synthesis, achieving superior generalization on challenging benchmarks.
Contribution
The paper introduces a multi-task learning framework with auxiliary branches and a data synthesis strategy to improve deepfake video detection and generalization.
Findings
Outperforms recent state-of-the-art methods on multiple benchmarks.
Effectively captures subtle spatio-temporal artifacts.
Enhances generalization to unseen deepfake generation methods.
Abstract
Detecting deepfake videos is highly challenging given the complexity of characterizing spatio-temporal artifacts. Most existing methods rely on binary classifiers trained on real and fake image sequences, which hinders their generalization to unseen generation methods. Moreover, with the constant progress in generative Artificial Intelligence (AI), deepfake artifacts are becoming imperceptible at both the spatial and the temporal levels, making them extremely difficult to capture. To address these issues, we propose a fine-grained deepfake video detection approach called FakeSTormer that enforces the modeling of subtle spatio-temporal inconsistencies while avoiding overfitting. Specifically, we introduce a multi-task learning framework that incorporates two auxiliary branches that explicitly attend to artifact-prone spatial and temporal regions. Additionally, we…
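As a rough illustration of the multi-task idea described in the abstract, the sketch below combines a binary real/fake classification loss with two auxiliary regression losses, one for a spatial branch and one for a temporal branch. Everything specific here is an assumption made for illustration: the function names, the choice of binary cross-entropy and mean squared error, the flat list representation of the branch outputs, and the weights `w_spatial` and `w_temporal` are hypothetical and are not taken from the paper.

```python
import math


def binary_cross_entropy(prob, label, eps=1e-7):
    """Cross-entropy for one predicted fake-probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def mean_squared_error(pred, target):
    """Mean squared error between two equal-length lists of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_task_loss(cls_prob, label,
                    spatial_pred, spatial_target,
                    temporal_pred, temporal_target,
                    w_spatial=1.0, w_temporal=1.0):
    """Hypothetical weighted sum of a main loss and two auxiliary losses.

    cls_prob:       predicted probability that the clip is fake
    spatial_*:      flattened per-region scores (assumed representation)
    temporal_*:     per-frame scores (assumed representation)
    w_spatial/w_temporal: illustrative weights, not values from the paper
    """
    loss_cls = binary_cross_entropy(cls_prob, label)
    loss_spatial = mean_squared_error(spatial_pred, spatial_target)
    loss_temporal = mean_squared_error(temporal_pred, temporal_target)
    return loss_cls + w_spatial * loss_spatial + w_temporal * loss_temporal


if __name__ == "__main__":
    total = multi_task_loss(
        cls_prob=0.5, label=1,
        spatial_pred=[1.0, 1.0], spatial_target=[0.0, 1.0],
        temporal_pred=[0.2], temporal_target=[0.2],
        w_spatial=2.0,
    )
    print(total)
```

In a setup like this, the auxiliary terms act as extra supervision that pushes shared features toward localized cues, instead of relying only on the binary label. How the actual method defines its branch targets and balances the terms is not stated in this summary.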
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis
