Generalizing Deepfake Video Detection with Plug-and-Play: Video-Level Blending and Spatiotemporal Adapter Tuning

Zhiyuan Yan; Yandan Zhao; Shen Chen; Mingyi Guo; Xinghe Fu; Taiping Yao; Shouhong Ding; Li Yuan

arXiv:2408.17065·cs.CV·December 3, 2024·2 cites

TL;DR

This paper introduces an approach to deepfake video detection that combines video-level blending data with a lightweight spatiotemporal adapter, aiming to improve generalization to diverse forgeries, balance the learning of spatial and temporal artifacts, and reduce computational cost.

Contribution

The paper proposes a technique for producing video-level blending data and a lightweight spatiotemporal adapter, both intended to improve the generalization and efficiency of deepfake detection models.

Findings

01 Effective generalization to unseen forgeries

02 Balanced learning of spatial and temporal artifacts

03 Improved efficiency with lightweight model design

Abstract

Three key challenges hinder the development of current deepfake video detection: (1) Temporal features can be complex and diverse: how can we identify general temporal artifacts to enhance model generalization? (2) Spatiotemporal models often lean heavily on one type of artifact and ignore the other: how can we ensure balanced learning from both? (3) Videos are naturally resource-intensive: how can we tackle efficiency without compromising accuracy? This paper attempts to tackle the three challenges jointly. First, inspired by the notable generality of using image-level blending data for image forgery detection, we investigate whether and how video-level blending can be effective in video. We then perform a thorough analysis and identify a previously underexplored temporal forgery artifact: Facial Feature Drift (FFD), which commonly exists across different forgeries. To reproduce FFD,…

Taxonomy

Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques

Methods: Adapter