Explicit Correlation Learning for Generalizable Cross-Modal Deepfake Detection
Cai Yu, Shan Jia, Xiaomeng Fu, Jin Liu, Jiahe Tian, Jiao Dai, Xi Wang, Siwei Lyu, Jizhong Han

TL;DR
This paper introduces a novel correlation distillation approach to improve the generalizability of cross-modal deepfake detection, supported by a new comprehensive dataset and experimental validation.
Contribution
It proposes a correlation distillation task to explicitly model cross-modal content correlation, enhancing detection across diverse deepfake generation methods.
Findings
Outperforms existing methods on CMDFD and FakeAVCeleb datasets.
Demonstrates improved generalizability across multiple deepfake types.
Provides CMDFD, a new dataset covering four generation methods, for evaluating cross-modal deepfake detection.
Abstract
With the rising prevalence of deepfakes, there is a growing interest in developing generalizable detection methods for various types of deepfakes. While effective in their specific modalities, traditional detection methods fall short in addressing the generalizability of detection across diverse cross-modal deepfakes. This paper aims to explicitly learn potential cross-modal correlation to enhance deepfake detection towards various generation scenarios. Our approach introduces a correlation distillation task, which models the inherent cross-modal correlation based on content information. This strategy helps to prevent the model from overfitting merely to audio-visual synchronization. Additionally, we present the Cross-Modal Deepfake Dataset (CMDFD), a comprehensive dataset with four generation methods to evaluate the detection of diverse cross-modal deepfakes. The experimental results…
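The abstract names a correlation distillation task but does not spell out the features or the loss, so the following is only a minimal sketch of one way such a task could be set up, not the authors' implementation. It assumes per-frame audio and visual embeddings, a temperature-scaled softmax over cosine similarities as the "correlation", and a KL-divergence loss that pulls a student's correlation matrix toward a teacher's content-based one. All function names, the temperature value, and the choice of KL divergence are hypothetical.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-8)

def softmax(row, temperature=1.0):
    # Numerically stable softmax with a temperature.
    m = max(row)
    exps = [math.exp((x - m) / temperature) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def correlation_matrix(audio_feats, visual_feats, temperature=0.1):
    # Row i is a distribution over visual frames for audio frame i
    # (assumed definition of "cross-modal correlation" for this sketch).
    return [
        softmax([cosine(a, v) for v in visual_feats], temperature)
        for a in audio_feats
    ]

def distillation_loss(teacher_corr, student_corr, eps=1e-8):
    # Mean row-wise KL(teacher || student); an assumed loss, not taken from the paper.
    total = 0.0
    for t_row, s_row in zip(teacher_corr, student_corr):
        total += sum(
            t * math.log((t + eps) / (s + eps)) for t, s in zip(t_row, s_row)
        )
    return total / len(teacher_corr)

# Toy example: two audio frames and two visual frames with 2-d embeddings.
audio = [[1.0, 0.0], [0.0, 1.0]]
visual_matched = [[1.0, 0.0], [0.0, 1.0]]
visual_swapped = [[0.0, 1.0], [1.0, 0.0]]

teacher = correlation_matrix(audio, visual_matched)
student_aligned = correlation_matrix(audio, visual_matched)
student_misaligned = correlation_matrix(audio, visual_swapped)

loss_aligned = distillation_loss(teacher, student_aligned)
loss_misaligned = distillation_loss(teacher, student_misaligned)
```

In this toy setup the loss is zero when the student reproduces the teacher's correlation and grows when the audio and visual content are mismatched, which is the kind of signal a content-based correlation target would provide beyond simple synchronization cues.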
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis
