Unsupervised Multimodal Deepfake Detection Using Intra- and Cross-Modal Inconsistencies
Mulin Tian, Mahyar Khayatkhoei, Joe Mathai, Wael AbdAlmageed

TL;DR
This paper introduces an unsupervised deepfake detection method that identifies intra- and cross-modal inconsistencies in videos, effectively detecting deepfakes without needing labeled data or prior deepfake examples.
Contribution
The paper presents a novel unsupervised approach based on inconsistency detection, which the authors describe as scalable, generalizable, reliable, and explainable, and which outperforms existing unsupervised methods on the FakeAVCeleb dataset.
Findings
Outperforms state-of-the-art unsupervised methods on the FakeAVCeleb dataset
Does not require real samples for each identity during inference
Can pinpoint exact locations of modality inconsistencies
Abstract
Deepfake videos present an increasing threat to society with potentially negative impact on criminal justice, democracy, and personal safety and privacy. Meanwhile, detecting deepfakes, at scale, remains a very challenging task that often requires labeled training data from existing deepfake generation methods. Further, even the most accurate supervised deepfake detection methods do not generalize to deepfakes generated using new generation methods. In this paper, we propose a novel unsupervised method for detecting deepfake videos by directly identifying intra-modal and cross-modal inconsistency between video segments. The fundamental hypothesis behind the proposed detection method is that motion or identity inconsistencies are inevitable in deepfake videos. We will mathematically and empirically support this hypothesis, and then proceed to constructing our method grounded in our…
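The idea of scoring inconsistency between video segments can be illustrated with a toy sketch. This is not the authors' implementation: the truncated abstract does not specify the features or the scoring rule, so everything here is an assumption made for illustration. In particular, it assumes each segment has already been reduced to an embedding vector, that audio and video embeddings live in a shared space, and that cosine distance is a reasonable inconsistency measure. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def intra_modal_inconsistency(segments: List[Vector]) -> List[float]:
    """Assumed score: cosine distance between consecutive segments of one modality."""
    return [1.0 - cosine(segments[i], segments[i + 1])
            for i in range(len(segments) - 1)]


def cross_modal_inconsistency(video: List[Vector], audio: List[Vector]) -> List[float]:
    """Assumed score: cosine distance between time-aligned video and audio segments."""
    return [1.0 - cosine(v, a) for v, a in zip(video, audio)]


def most_inconsistent(scores: List[float]) -> Tuple[int, float]:
    """Return (index, score) of the largest score, i.e. where the mismatch is located."""
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]


# Toy embeddings (made up): the third video segment disagrees with its
# neighbours and with the audio track.
video = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
audio = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

intra = intra_modal_inconsistency(video)
cross = cross_modal_inconsistency(video, audio)
location = most_inconsistent(cross)
```

Because the scores are computed per segment, the index returned by `most_inconsistent` points at where the mismatch occurs, which is the kind of localization the findings list mentions. No labeled fake examples are involved in this sketch; how a decision threshold would be chosen is left open here.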
Taxonomy
Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques
