Consistent4D: Consistent 360° Dynamic Object Generation from Monocular Video

Yanqin Jiang; Li Zhang; Jin Gao; Weimin Hu; Yao Yao

arXiv:2311.02848·cs.CV·November 7, 2023·1 cites

Open Access

TL;DR

Consistent4D introduces a monocular video-based method for 4D dynamic object generation, leveraging a 3D-aware diffusion model and novel loss functions to ensure spatial and temporal consistency without multi-view data or calibration.

Contribution

It proposes a new framework for 4D object reconstruction from monocular videos using diffusion models and a cascade DyNeRF with interpolation-based consistency loss.

Findings

01 Achieves results competitive with prior methods

02 Demonstrates effective 4D dynamic object generation from monocular videos

03 Shows advantages in text-to-3D generation tasks

Abstract

In this paper, we present Consistent4D, a novel approach for generating 4D dynamic objects from uncalibrated monocular videos. Uniquely, we cast the 360-degree dynamic object reconstruction as a 4D generation problem, eliminating the need for tedious multi-view data collection and camera calibration. This is achieved by leveraging the object-level 3D-aware image diffusion model as the primary supervision signal for training Dynamic Neural Radiance Fields (DyNeRF). Specifically, we propose a Cascade DyNeRF to facilitate stable convergence and temporal continuity under the supervision signal which is discrete along the time axis. To achieve spatial and temporal consistency, we further introduce an Interpolation-driven Consistency Loss. It is optimized by minimizing the discrepancy between rendered frames from DyNeRF and interpolated frames from a pre-trained video interpolation model.…
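The Interpolation-driven Consistency Loss described above can be illustrated with a minimal sketch. It assumes the loss compares a rendered middle frame against a frame interpolated from its two rendered neighbours. The mean-squared-error discrepancy, the function names, and the linear blend standing in for the pre-trained video interpolation model are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def linear_interpolate(frame_a, frame_b, alpha=0.5):
    """Hypothetical stand-in for a pre-trained video interpolation model.

    A real model would synthesize the in-between frame with a learned
    network; a plain linear blend keeps this sketch self-contained.
    """
    return (1.0 - alpha) * frame_a + alpha * frame_b


def interpolation_consistency_loss(rendered_start, rendered_mid, rendered_end,
                                   interpolate=linear_interpolate):
    """Discrepancy between a rendered frame and its interpolated counterpart.

    The three inputs are frames rendered from the dynamic radiance field,
    for example at three adjacent time steps. MSE is an assumed choice of
    discrepancy measure.
    """
    target = interpolate(rendered_start, rendered_end)
    return float(np.mean((rendered_mid - target) ** 2))


# Toy frames: 4x4 RGB images with values in [0, 1).
rng = np.random.default_rng(0)
start = rng.random((4, 4, 3))
end = rng.random((4, 4, 3))

consistent_mid = 0.5 * (start + end)   # agrees with the interpolated frame
inconsistent_mid = rng.random((4, 4, 3))  # unrelated to its neighbours

loss_consistent = interpolation_consistency_loss(start, consistent_mid, end)
loss_inconsistent = interpolation_consistency_loss(start, inconsistent_mid, end)
print(loss_consistent, loss_inconsistent)
```

A middle frame that agrees with the interpolated frame yields a loss near zero, while an unrelated middle frame is penalized. In training, the gradient of such a loss would flow back into the radiance field parameters. The same three-frame pattern could presumably be applied to neighbouring viewpoints as well as adjacent time steps, since the abstract states that the loss targets both spatial and temporal consistency.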

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion