Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models
Hanwen Liang, Yuyang Yin, Dejia Xu, Hanxue Liang, Zhangyang Wang, Konstantinos N. Plataniotis, Yao Zhao, Yunchao Wei

TL;DR
Diffusion4D introduces an efficient framework for 4D content generation that ensures spatial-temporal consistency, leveraging a novel 4D-aware diffusion model, a curated dataset, and explicit 4D construction techniques.
Contribution
The paper presents a new scalable 4D diffusion framework that improves speed and consistency in 4D asset generation, incorporating novel metrics and reconstruction losses.
Findings
Outperforms prior methods in efficiency and 4D consistency
Generates high-fidelity 4D assets within minutes
Achieves superior multi-view and temporal coherence
Abstract
The availability of large-scale multimodal datasets and advancements in diffusion models have significantly accelerated progress in 4D content generation. Most prior approaches rely on multiple image or video diffusion models, utilizing score distillation sampling for optimization or generating pseudo novel views for direct supervision. However, these methods are hindered by slow optimization speeds and multi-view inconsistency issues. Spatial and temporal consistency in 4D geometry has been extensively explored respectively in 3D-aware diffusion models and traditional monocular video diffusion models. Building on this foundation, we propose a strategy to migrate the temporal consistency in video diffusion models to the spatial-temporal consistency required for 4D generation. Specifically, we present a novel framework, Diffusion4D, for efficient and scalable 4D content…
Taxonomy
Topics: Advanced Vision and Imaging · Video Coding and Compression Technologies · Computer Graphics and Visualization Techniques
Methods: Sparse Evolutionary Training · Diffusion
