Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image

Yanran Zhang; Ziyi Wang; Wenzhao Zheng; Zheng Zhu; Jie Zhou; Jiwen Lu

arXiv:2512.05044·cs.CV·December 5, 2025

Open Access

TL;DR

This paper introduces MoRe4D, a framework that jointly reconstructs 3D geometry and generates motion to synthesize dynamic 4D scenes from a single image, utilizing a new large-scale dataset and diffusion-based trajectory generation.

Contribution

The paper presents a joint reconstruction and motion generation framework (MoRe4D) for 4D scene synthesis from a single image, along with a new dataset (TrajScene-60K) and a diffusion-based trajectory generator (4D-STraG).

Findings

01 MoRe4D produces high-quality, multi-view consistent 4D scenes.

02 The method effectively integrates geometry and dynamics from a single image.

03 Experiments demonstrate rich dynamic details in synthesized 4D scenes.

Abstract

Generating interactive and dynamic 4D scenes from a single static image remains a core challenge. Most existing generate-then-reconstruct and reconstruct-then-generate methods decouple geometry from motion, causing spatiotemporal inconsistencies and poor generalization. To address these issues, we extend the reconstruct-then-generate framework to jointly perform Motion generation and geometric Reconstruction for 4D Synthesis (MoRe4D). We first introduce TrajScene-60K, a large-scale dataset of 60,000 video samples with dense point trajectories, addressing the scarcity of high-quality 4D scene data. Based on this, we propose a diffusion-based 4D Scene Trajectory Generator (4D-STraG) to jointly generate geometrically consistent and motion-plausible 4D point trajectories. To leverage single-view priors, we design a depth-guided motion normalization strategy and a motion-aware module for effective…
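
The abstract names a depth-guided motion normalization strategy but does not say how it is computed. As a rough illustration only, the sketch below shows one hypothetical reading: each 3D point trajectory is stored as a list of per-frame positions, and its displacement from the first frame is divided by the point's first-frame depth, so that near and far points with the same apparent motion get comparable values. The function names, the `eps` guard, and the choice of frame-0 depth as the scale are all assumptions, not details taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]   # (x, y, z) in camera coordinates; z is depth
Trajectory = List[Point]             # one tracked point over T frames


def normalize_motion(traj: Trajectory, eps: float = 1e-6) -> List[Point]:
    """Displacement from frame 0, divided by the frame-0 depth.

    Hypothetical normalization for illustration; not the paper's formula.
    """
    x0, y0, z0 = traj[0]
    scale = max(abs(z0), eps)        # guard against a zero depth
    return [((x - x0) / scale, (y - y0) / scale, (z - z0) / scale)
            for x, y, z in traj]


def denormalize_motion(norm: List[Point], start: Point,
                       eps: float = 1e-6) -> Trajectory:
    """Invert normalize_motion given the frame-0 position of the point."""
    x0, y0, z0 = start
    scale = max(abs(z0), eps)
    return [(x0 + dx * scale, y0 + dy * scale, z0 + dz * scale)
            for dx, dy, dz in norm]


# Two points with the same motion relative to their depth.
near = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0)]
far = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0)]

near_norm = normalize_motion(near)
far_norm = normalize_motion(far)
far_back = denormalize_motion(far_norm, far[0])
```

Under this assumed scheme, `near_norm` and `far_norm` describe the same normalized motion even though the raw displacements differ by a factor of ten, and `denormalize_motion` recovers the original trajectory from the normalized one and the starting position.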

Taxonomy

Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis