TL;DR
This paper introduces D3, a training-free method for detecting AI-generated videos by analyzing second-order temporal features, leveraging a theoretical framework to identify artifacts with high accuracy and efficiency.
Contribution
The paper presents a novel second-order feature analysis framework and a training-free detection method, D3, that effectively identifies AI-generated videos using temporal artifacts.
Findings
D3 outperforms previous methods by 10.39% on the GenVideo dataset.
D3 demonstrates high computational efficiency and robustness.
Theoretical analysis reveals fundamental differences in second-order features between real and synthetic videos.
Abstract
The evolution of video generation techniques, such as Sora, has made it increasingly easy to produce high-fidelity AI-generated videos, raising public concern over the dissemination of synthetic content. However, existing detection methodologies remain limited by their insufficient exploration of temporal artifacts in synthetic videos. To bridge this gap, we establish a theoretical framework through second-order dynamical analysis under Newtonian mechanics, subsequently extending the Second-order Central Difference features tailored for temporal artifact detection. Building on this theoretical foundation, we reveal a fundamental divergence in second-order feature distributions between real and AI-generated videos. Concretely, we propose Detection by Difference of Differences (D3), a novel training-free detection method that leverages the above second-order temporal discrepancies. We…
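As a rough illustration of what "difference of differences" means, the second-order central difference of a per-frame feature sequence f_t is f_{t+1} - 2 f_t + f_{t-1}, which equals the first difference of the first differences. The sketch below is a minimal NumPy example, not the authors' implementation. The function names, the (T, D) feature layout, and the summary statistic are assumptions made for illustration. This summary does not specify the feature extractor or how D3 turns these quantities into a detection score.

```python
import numpy as np


def second_order_central_difference(features):
    """Return f[t+1] - 2*f[t] + f[t-1] for each interior frame.

    `features` is assumed to be a (T, D) array holding one D-dimensional
    feature vector per frame; the result has shape (T - 2, D).
    """
    features = np.asarray(features, dtype=float)
    if features.ndim != 2 or features.shape[0] < 3:
        raise ValueError("expected a (T, D) array with at least 3 frames")
    return features[2:] - 2.0 * features[1:-1] + features[:-2]


def difference_of_differences(features):
    """Same quantity, computed as a first difference of first differences."""
    features = np.asarray(features, dtype=float)
    first = np.diff(features, axis=0)   # (T - 1, D)
    return np.diff(first, axis=0)       # (T - 2, D)


def second_order_stats(features):
    """Hypothetical summary: mean and std of per-frame second-order magnitude."""
    second = second_order_central_difference(features)
    magnitudes = np.linalg.norm(second, axis=1)
    return float(magnitudes.mean()), float(magnitudes.std())


if __name__ == "__main__":
    t = np.arange(6, dtype=float)[:, None]
    constant_velocity = 3.0 * t + 1.0   # linear in time: zero second-order term
    constant_acceleration = t ** 2      # quadratic in time: constant second-order term
    print(second_order_stats(constant_velocity))
    print(second_order_stats(constant_acceleration))
```

In this toy setup, a feature trajectory that moves at constant velocity has a zero second-order term, while one under constant acceleration has a constant nonzero term. How the distributions of such terms differ between real and generated videos is the subject of the paper's analysis and is not reproduced here.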
