Distinguish Any Fake Videos: Unleashing the Power of Large-scale Data and Motion Features
Lichuan Ji, Yingqi Lin, Zhenhua Huang, Yan Han, Xiaogang Xu, Jiafei Wu, Chong Wang, Zhe Liu

TL;DR
This paper introduces a large-scale video dataset and a dual-branch 3D transformer model that detects AI-generated videos from motion and appearance features, reaching 96.77% accuracy and generalizing to unseen video types.
Contribution
The paper presents GenVidDet, a comprehensive dataset for AI-generated video detection, and DuB3D, a new dual-branch transformer model that combines motion and visual cues for improved detection.
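To make the dual-branch idea concrete, here is a minimal NumPy sketch of one way motion and appearance cues could be fused for a real-versus-generated score. It is an illustration under stated assumptions, not the DuB3D architecture: frame differencing stands in for whatever motion feature the paper uses, each "branch" is just average pooling followed by a linear projection rather than a 3D transformer, and all function names, shapes, and random weights are hypothetical.

```python
import numpy as np

def motion_cue(frames):
    """Frame-to-frame differences as a crude motion signal.

    Assumption: a stand-in for the paper's motion feature, which is not
    specified in this summary. Input shape (T, H, W, C), output (T-1, H, W, C).
    """
    return np.diff(frames, axis=0)

def branch_embed(x, weights):
    """Hypothetical branch: average-pool over time and space, then project."""
    pooled = x.mean(axis=(0, 1, 2))   # shape (C,)
    return pooled @ weights           # shape (embed_dim,)

def dual_branch_score(frames, w_app, w_mot, w_head):
    """Fuse appearance and motion embeddings and map them to a score in (0, 1)."""
    app = branch_embed(frames, w_app)              # appearance branch
    mot = branch_embed(motion_cue(frames), w_mot)  # motion branch
    fused = np.concatenate([app, mot])             # late fusion by concatenation
    logit = fused @ w_head
    return 1.0 / (1.0 + np.exp(-logit))            # sigmoid

# Toy input: 8 frames of 16x16 RGB noise and untrained random weights.
rng = np.random.default_rng(0)
frames = rng.random((8, 16, 16, 3))
w_app = rng.normal(size=(3, 4))
w_mot = rng.normal(size=(3, 4))
w_head = rng.normal(size=8)
score = dual_branch_score(frames, w_app, w_mot, w_head)
```

With untrained weights the score carries no meaning. The sketch only shows the data flow: two views of the same clip are embedded separately and then combined before classification.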
Findings
DuB3D achieves 96.77% accuracy in distinguishing real from generated videos.
GenVidDet contains over 2.66 million diverse video instances.
DuB3D demonstrates strong generalization to unseen video types.
Abstract
The development of AI-Generated Content (AIGC) has empowered the creation of remarkably realistic AI-generated videos, such as those involving Sora. However, the widespread adoption of these models raises concerns regarding potential misuse, including face video scams and copyright disputes. Addressing these concerns requires the development of robust tools capable of accurately determining video authenticity. The main challenges lie in the dataset and neural classifier for training. Current datasets lack a varied and comprehensive repository of real and generated content for effective discrimination. In this paper, we first introduce an extensive video dataset designed specifically for AI-Generated Video Detection (GenVidDet). It includes over 2.66 M instances of both real and generated videos, varying in categories, frames per second, resolutions, and lengths. The comprehensiveness of…
Taxonomy
Topics: Digital Media Forensic Detection · Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections
