GM-DF: Generalized Multi-Scenario Deepfake Detection
Yingxin Lai, Zitong Yu, Jing Yang, Bin Li, Xiangui Kang, Linlin Shen

TL;DR
This paper introduces GM-DF, a unified deepfake detection framework that enhances generalization across multiple datasets and unseen scenarios through hybrid modeling, CLIP features, masked reconstruction, and meta-learning.
Contribution
The paper proposes a novel multi-scenario deepfake detection model with domain-specific features, cross-domain alignment, and meta-learning, and establishes a new benchmark for multi-dataset evaluation.
Findings
Significant improvement in generalization to unseen datasets.
Effective domain alignment and feature extraction techniques.
Robust detection performance demonstrated on five datasets.
Abstract
Existing face forgery detection usually follows the paradigm of training models in a single domain, which leads to limited generalization capacity when unseen scenarios and unknown attacks occur. In this paper, we elaborately investigate the generalization capacity of deepfake detection models when jointly trained on multiple face forgery detection datasets. We first find a rapid degradation of detection accuracy when models are directly trained on combined datasets due to the discrepancy across collection scenarios and generation methods. To address the above issue, a Generalized Multi-Scenario Deepfake Detection framework (GM-DF) is proposed to serve multiple real-world scenarios by a unified model. First, we propose a hybrid expert modeling approach for domain-specific real/forgery feature extraction. Besides, as for the commonality representation, we use CLIP to extract the common…
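As a rough illustration of the "hybrid expert" idea mentioned in the abstract, the sketch below combines several domain-specific classifiers through a softmax gate. It is a generic soft mixture-of-experts under stated assumptions, not the GM-DF architecture, whose details are not given here. The names `hybrid_expert_logit`, `experts`, and `gate`, the use of linear experts, and all numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dot product of weights and x, plus bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hybrid_expert_logit(x, experts, gate):
    """Blend per-domain expert logits with gate weights (hypothetical helper).

    experts: list of (weights, bias), one linear real/fake scorer per domain.
    gate:    list of (weights, bias), one gating scorer per expert.
    Returns the blended logit and the gate weights used.
    """
    gate_weights = softmax([linear(w, b, x) for w, b in gate])
    expert_logits = [linear(w, b, x) for w, b in experts]
    logit = sum(g * e for g, e in zip(gate_weights, expert_logits))
    return logit, gate_weights


# Toy input: a 3-dimensional feature vector and two domain experts.
# All values are illustrative only, not taken from the paper.
features = [0.5, -1.0, 2.0]
experts = [([0.2, 0.1, 0.4], 0.0), ([-0.3, 0.5, 0.1], 0.1)]
gate = [([1.0, 0.0, 0.0], 0.0), ([0.0, 1.0, 0.0], 0.0)]

logit, weights = hybrid_expert_logit(features, experts, gate)
fake_probability = sigmoid(logit)
print(weights, fake_probability)
```

The gate lets a single model weight each domain's expert differently per input, which is one common way to keep domain-specific features separate while still producing a unified prediction.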
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis
Methods: ALIGN · Contrastive Language-Image Pre-training
