Grab-3D: Detecting AI-Generated Videos from 3D Geometric Temporal Consistency

Wenhan Chen; Sezer Karaoglu; Theo Gevers

arXiv:2512.13665 · cs.CV · December 16, 2025

TL;DR

Grab-3D introduces a geometry-aware transformer that leverages 3D geometric temporal consistency to effectively detect AI-generated videos, outperforming existing methods and generalizing well across different generators.

Contribution

The paper presents Grab-3D, a novel transformer framework utilizing 3D geometric features, specifically vanishing points, for improved detection of AI-generated videos.

Findings

01 Grab-3D outperforms state-of-the-art detectors.

02 It achieves robust cross-domain generalization.

03 It uses vanishing points for 3D geometric analysis.

Abstract

Recent advances in diffusion-based generation techniques enable AI models to produce highly realistic videos, heightening the need for reliable detection mechanisms. However, existing detection methods provide only limited exploration of the 3D geometric patterns present in generated videos. In this paper, we use vanishing points as an explicit representation of 3D geometry patterns, revealing fundamental discrepancies in geometric consistency between real and AI-generated videos. We introduce Grab-3D, a geometry-aware transformer framework for detecting AI-generated videos based on 3D geometric temporal consistency. To enable reliable evaluation, we construct an AI-generated video dataset of static scenes, allowing stable 3D geometric feature extraction. We propose a geometry-aware transformer equipped with geometric positional encoding, temporal-geometric attention, and an EMA-based…
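
To make the vanishing-point idea concrete, here is a minimal sketch, not the paper's pipeline. It assumes two image segments already known to be projections of parallel 3D lines, and a fixed camera viewing a static scene, in which case the vanishing point of a given 3D direction should stay put from frame to frame. The function names (`vanishing_point`, `temporal_drift`) and the drift score are hypothetical illustrations, not part of Grab-3D.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def vanishing_point(seg_a, seg_b, eps=1e-9):
    """Intersect two image segments assumed parallel in 3D.

    Returns (x, y), or None when the lines are also parallel in the image
    (the vanishing point lies at infinity).
    """
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < eps:
        return None
    return (x / w, y / w)

def temporal_drift(points):
    """Hypothetical consistency score: mean frame-to-frame displacement
    of one vanishing point tracked across frames."""
    steps = [math.dist(p, q) for p, q in zip(points, points[1:])]
    return sum(steps) / len(steps) if steps else 0.0

# Two segments on the lines y = x and y = 4 - x meet at (2, 2).
vp = vanishing_point(((0, 0), (1, 1)), ((0, 4), (1, 3)))
print(vp)

# A perfectly stable track has zero drift; a jumping one does not.
print(temporal_drift([(2, 2), (2, 2), (2, 2)]))
print(temporal_drift([(0, 0), (3, 4), (3, 4)]))
```

Under these assumptions, a low drift is what a geometrically consistent video would show, and a large drift hints at inconsistent 3D structure. A real detector would need line detection, grouping of lines by 3D direction, and handling of camera motion, none of which this sketch covers.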

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis