3DPhysVideo: Consistency-Guided Flow SDE for Video Generation via 3D Scene Reconstruction and Physical Simulation

Hwidong Kim; Yunho Kim; Tae-Kyun Kim

arXiv:2605.16795·cs.CV·May 19, 2026

TL;DR

3DPhysVideo is a training-free, physics-guided pipeline that generates physically realistic videos from a single image by combining novel view synthesis for 3D scene reconstruction, physics simulation, and consistency-guided video synthesis.

Contribution

The paper proposes a consistency-guided flow SDE method that generates physically plausible videos from a single image without any training, reusing an off-the-shelf image-to-video model together with physics solvers.

Findings

01

Outperforms state-of-the-art baselines on GPT scores and the VideoPhy benchmark.

02

Successfully generates multi-object and fluid interaction videos from single images.

03

Runs efficiently on a single consumer GPU.

Abstract

Video generative models have made remarkable progress, yet they often yield visual artifacts that violate grounding in physical dynamics. Recent works such as PhysGen3D tackle single image-to-3D physics through mesh reconstruction and Physically-Based Rendering, but challenges remain in modeling fluid dynamics, multi-object interactions and photorealism. This work introduces 3DPhysVideo, a novel training-free pipeline that generates physically realistic videos from a single image. We repurpose an off-the-shelf video model for two stages. First, we use it as a novel view synthesizer to reconstruct complete 360-degree 3D scene geometry by guiding the image-to-video (I2V) flow model with rendered point clouds. Second, after applying physics solvers to this geometry, the physically simulated point cloud is used to guide the same I2V flow model to synthesize final, high-quality videos.…
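
The idea of guiding a flow model with a rendered point cloud can be illustrated with a toy sampler. The sketch below is not the paper's formulation: it is a generic stochastic flow sampler with masked guidance, written under several assumptions. The names `toy_velocity`, `guided_stochastic_sample`, `reference`, `mask`, `target`, and `eta` are all hypothetical, the "model" is a closed-form stand-in for a pretrained I2V flow model, and the arrays stand in for video latents. It assumes the convention x_t = t * x1 + (1 - t) * x0, with x0 as noise and x1 as the clean sample.

```python
import numpy as np

def toy_velocity(x, t, target):
    # Hypothetical stand-in for a pretrained I2V flow model:
    # its velocity always points straight at a fixed `target`.
    return (target - x) / (1.0 - t)

def guided_stochastic_sample(reference, mask, target, steps=20, eta=0.5, seed=0):
    """Toy stochastic flow sampler with masked guidance (illustrative only).

    reference: stand-in for a rendered (simulated) point cloud
    mask:      1 where the render is valid, 0 where the model must fill in
    eta:       fraction of fresh noise injected per step (0 = deterministic)
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(reference.shape)      # t = 0 is pure noise
    ts = np.linspace(0.0, 1.0, steps + 1)
    for t, t_next in zip(ts[:-1], ts[1:]):
        v = toy_velocity(x, t, target)
        x1_hat = x + (1.0 - t) * v                # predicted clean sample
        x0_hat = x - t * v                        # predicted noise
        # Guidance: keep the rendered content wherever it is visible.
        x1_hat = mask * reference + (1.0 - mask) * x1_hat
        # Stochastic part: mix predicted noise with fresh noise.
        fresh = rng.standard_normal(x.shape)
        eps = np.sqrt(1.0 - eta**2) * x0_hat + eta * fresh
        # Move to the next time on the interpolation path.
        x = t_next * x1_hat + (1.0 - t_next) * eps
    return x

# Toy usage: the left half is "visible" in the render, the right half is not.
reference = np.ones((4, 4))
mask = np.zeros((4, 4))
mask[:, :2] = 1.0
target = np.zeros((4, 4))
out = guided_stochastic_sample(reference, mask, target)
```

In this toy setting the output matches `reference` where `mask` is 1 and falls back to the model's own prediction elsewhere, which mirrors the role the abstract gives to rendered point clouds as guidance for the same I2V model in both stages.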
