Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis
Bohan Zeng, Ling Yang, Siyu Li, Jiaming Liu, Zixiang Zhang, Juanxi, Tian, Kaixin Zhu, Yongzhen Guo, Fu-Yun Wang, Minkai Xu, Stefano Ermon, Wentao, Zhang

TL;DR
Trans4D introduces a novel framework for realistic, geometry-aware 4D scene synthesis from text, enabling complex object deformations and scene transitions with improved quality over existing methods.
Contribution
It presents a new text-to-4D synthesis approach combining large language models and a geometry-aware transition network for realistic scene deformation.
Findings
Outperforms state-of-the-art in 4D scene quality
Produces accurate and high-quality scene transitions
Effectively models complex object deformations
Abstract
Recent advances in diffusion models have demonstrated exceptional capabilities in image and video generation, further improving the effectiveness of 4D synthesis. Existing 4D generation methods can generate high-quality 4D objects or scenes based on user-friendly conditions, benefiting the gaming and video industries. However, these methods struggle to synthesize significant object deformation of complex 4D transitions and interactions within scenes. To address this challenge, we propose Trans4D, a novel text-to-4D synthesis framework that enables realistic complex scene transitions. Specifically, we first use multi-modal large language models (MLLMs) to produce a physic-aware scene description for 4D scene initialization and effective transition timing planning. Then we propose a geometry-aware 4D transition network to realize a complex scene-level 4D transition based on the plan,…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsModular Robots and Swarm Intelligence
MethodsDiffusion
