Phy124: Fast Physics-Driven 4D Content Generation from a Single Image
Jiajing Lin, Zhenzhong Wang, Yongjie Hou, Yuzhou Tang, Min Jiang

TL;DR
Phy124 is a fast, physics-driven method for generating 4D dynamic content from a single image. It avoids diffusion models, produces motion that follows real-world physics, lets users steer the dynamics with external forces, and reduces inference time.
Contribution
The paper presents Phy124, a novel approach that integrates physical simulation into 4D content generation, significantly improving speed and realism over existing diffusion-based methods.
Findings
High-fidelity 4D content generated with reduced inference time
Effective control over 4D dynamics via external forces
State-of-the-art performance in 4D content realism
Abstract
4D content generation focuses on creating dynamic 3D objects that change over time. Existing methods primarily rely on pre-trained video diffusion models, utilizing sampling processes or reference videos. However, these approaches face significant challenges. Firstly, the generated 4D content often fails to adhere to real-world physics since video diffusion models do not incorporate physical priors. Secondly, the extensive sampling process and the large number of parameters in diffusion models result in exceedingly time-consuming generation processes. To address these issues, we introduce Phy124, a novel, fast, and physics-driven method for controllable 4D content generation from a single image. Phy124 integrates physical simulation directly into the 4D generation process, ensuring that the resulting 4D content adheres to natural physical laws. Phy124 also eliminates the use of…
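To make the idea of force-controlled, simulation-driven motion concrete, here is a minimal sketch. It is not Phy124's simulator, which the text above does not specify. It substitutes a toy damped-spring particle system, stepped with semi-implicit Euler, to show how a single external force vector can turn one static set of 3D points into a sequence of frames. Every name and parameter (`simulate_frames`, `stiffness`, `damping`, and so on) is hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def simulate_frames(
    rest_positions: List[Vec3],
    external_force: Vec3,
    num_frames: int = 24,
    dt: float = 1.0 / 24.0,
    stiffness: float = 40.0,
    damping: float = 2.0,
    mass: float = 1.0,
) -> List[List[Vec3]]:
    """Return one list of particle positions per frame (frame 0 is the rest pose).

    Toy model (an assumption, not the paper's method): each particle is tied
    to its rest position by a damped spring and pushed by a constant force.
    """
    positions = [list(p) for p in rest_positions]
    velocities = [[0.0, 0.0, 0.0] for _ in rest_positions]
    frames = [[tuple(p) for p in positions]]

    for _ in range(num_frames - 1):
        for i, rest in enumerate(rest_positions):
            for axis in range(3):
                spring = -stiffness * (positions[i][axis] - rest[axis])
                drag = -damping * velocities[i][axis]
                accel = (spring + drag + external_force[axis]) / mass
                # Semi-implicit Euler: update velocity first, then position.
                velocities[i][axis] += accel * dt
                positions[i][axis] += velocities[i][axis] * dt
        frames.append([tuple(p) for p in positions])
    return frames


# Usage: push two points along +x and collect ten frames of motion.
frames = simulate_frames([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], (2.0, 0.0, 0.0), num_frames=10)
```

Changing `external_force` changes the resulting motion without any sampling step, which is the kind of controllability the summary above attributes to a physics-driven pipeline.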
Taxonomy
Topics: Video Analysis and Summarization
Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
