Phy124: Fast Physics-Driven 4D Content Generation from a Single Image

Jiajing Lin; Zhenzhong Wang; Yongjie Hou; Yuzhou Tang; Min Jiang

arXiv:2409.07179·cs.CV·September 12, 2024

TL;DR

Phy124 is a fast, physics-driven method for generating realistic 4D dynamic content from a single image. It avoids diffusion models, keeps the generated motion controllable and consistent with real-world physics, and reduces inference time.

Contribution

The paper presents Phy124, a novel approach that integrates physical simulation into 4D content generation, significantly improving speed and realism over existing diffusion-based methods.

Findings

01. High-fidelity 4D content generated with reduced inference time

02. Effective control over 4D dynamics via external forces

03. State-of-the-art performance in 4D content realism

Abstract

4D content generation focuses on creating dynamic 3D objects that change over time. Existing methods primarily rely on pre-trained video diffusion models, utilizing sampling processes or reference videos. However, these approaches face significant challenges. Firstly, the generated 4D content often fails to adhere to real-world physics since video diffusion models do not incorporate physical priors. Secondly, the extensive sampling process and the large number of parameters in diffusion models result in exceedingly time-consuming generation processes. To address these issues, we introduce Phy124, a novel, fast, and physics-driven method for controllable 4D content generation from a single image. Phy124 integrates physical simulation directly into the 4D generation process, ensuring that the resulting 4D content adheres to natural physical laws. Phy124 also eliminates the use of…
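To make the idea of force-driven dynamics concrete, the following is a minimal sketch of how an external force can drive a set of particles through time to produce a sequence of frames. It is an illustration only and is not the simulator used in Phy124, which the truncated abstract above does not describe. The `Particle` class, the `step` and `simulate` functions, and the choice of semi-implicit Euler integration are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Particle:
    """Hypothetical point mass; not a data structure taken from the paper."""
    position: Vec3
    velocity: Vec3
    mass: float = 1.0


def step(particles: List[Particle], external_force: Vec3, dt: float) -> List[Particle]:
    """Advance all particles by one time step with semi-implicit Euler.

    The velocity is updated first from the force, then the position is
    updated from the new velocity.
    """
    advanced = []
    for p in particles:
        # Newton's second law: a = F / m
        accel = tuple(f / p.mass for f in external_force)
        velocity = tuple(v + a * dt for v, a in zip(p.velocity, accel))
        position = tuple(x + v * dt for x, v in zip(p.position, velocity))
        advanced.append(Particle(position, velocity, p.mass))
    return advanced


def simulate(particles: List[Particle], external_force: Vec3,
             dt: float, num_steps: int) -> List[List[Particle]]:
    """Return the initial state plus one state per simulated step."""
    frames = [particles]
    for _ in range(num_steps):
        particles = step(particles, external_force, dt)
        frames.append(particles)
    return frames


if __name__ == "__main__":
    start = [Particle(position=(0.0, 0.0, 0.0), velocity=(0.0, 0.0, 0.0))]
    # Changing the force vector changes the resulting motion.
    for frame in simulate(start, external_force=(1.0, 0.0, 0.0), dt=0.5, num_steps=2):
        print(frame[0].position)
```

In this toy setting, changing `external_force` is the only control knob: a different vector yields a different trajectory without any sampling or learned model in the loop. A real 4D pipeline would additionally need a 3D representation of the object and a material model, neither of which is sketched here.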


Taxonomy

Topics: Video Analysis and Summarization

Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings