Latent Spatiotemporal Adaptation for Generalized Face Forgery Video Detection
Daichi Zhang, Zihao Xiao, Jianmin Li, Shiming Ge

TL;DR
This paper introduces LAST, a method that models the spatiotemporal patterns of face forgery videos in latent space and adapts the detector to them, improving generalization to unseen forgery methods.
Contribution
The paper proposes a latent spatiotemporal adaptation approach that enhances face forgery detection generalization by modeling patterns in latent space and using semi-supervised learning.
Findings
Achieves state-of-the-art results on public datasets.
Demonstrates strong generalization to unseen forgery methods.
Pre-training with self-supervised tasks improves robustness.
Abstract
Face forgery videos have caused severe public concern, and many detectors have been proposed. However, most of these detectors suffer from limited generalization when detecting videos from unknown distributions, such as those produced by unseen forgery methods. In this paper, we find that different forgery videos have distinct spatiotemporal patterns, which may be the key to generalization. To leverage this finding, we propose a Latent Spatiotemporal Adaptation (LAST) approach to facilitate generalized face forgery video detection. The key idea is to optimize the detector so that it adapts to the spatiotemporal patterns of unknown videos in latent space, improving generalization. Specifically, we first model the spatiotemporal patterns of face videos by incorporating a lightweight CNN to extract local spatial features of each frame and then cascading a vision transformer to learn the long-term…
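The two-stage modeling described in the abstract (per-frame local spatial features, then a transformer over the frame sequence) can be illustrated with a toy, dependency-free sketch. This is not the authors' implementation: `frame_features` is a hypothetical stand-in that uses average pooling in place of the lightweight CNN, `temporal_self_attention` is a single attention head with identity projections in place of the vision transformer, and the latent adaptation step itself is not shown. All function names and sizes here are assumptions chosen for illustration.

```python
import math
import random


def frame_features(frame, patch=2):
    """Stand-in for the lightweight CNN (assumption): average-pool
    non-overlapping patches of one grayscale frame into a flat vector."""
    h, w = len(frame), len(frame[0])
    feats = []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            block = [frame[i + di][j + dj]
                     for di in range(patch) for dj in range(patch)]
            feats.append(sum(block) / len(block))
    return feats


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_self_attention(tokens):
    """Stand-in for the transformer (assumption): one head of scaled
    dot-product self-attention across frame tokens, identity projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens))
                    for c in range(d)])
    return out


def video_embedding(video):
    """Spatial features per frame, attention over time, then mean-pool
    the attended tokens into one latent vector for the whole clip."""
    tokens = [frame_features(f) for f in video]
    attended = temporal_self_attention(tokens)
    n = len(attended)
    return [sum(tok[c] for tok in attended) / n
            for c in range(len(attended[0]))]


random.seed(0)
# A toy "video": 4 frames of 8x8 grayscale pixels in [0, 1).
video = [[[random.random() for _ in range(8)] for _ in range(8)]
         for _ in range(4)]
emb = video_embedding(video)
print(len(emb))  # 16 features: a 4x4 grid of 2x2 patches per frame
```

A real detector would replace both stand-ins with learned modules and add a classification head; the sketch only shows how each frame becomes one token and how attention mixes information across time.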
Taxonomy
Topics: Digital Media Forensic Detection · Face recognition and analysis · Generative Adversarial Networks and Image Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Adam · Byte Pair Encoding · Softmax · Dropout · Label Smoothing · Absolute Position Encodings
