PhyWorld: Physics-Faithful World Model for Video Generation
Pu Zhao, Juyi Lin, Timothy Rupprecht, Arash Akbari, Chence Yang, Rahul Chowdhury, Elaheh Motamedi, Arman Akbari, Yumei He, Chen Wang, Geng Yuan, Weiwei Chen, Yanzhi Wang

TL;DR
PhyWorld is a video generation world model that produces temporally coherent, physically faithful scene continuations, with the aim of making world simulation for Physical AI more realistic.
Contribution
The paper introduces a two-stage post-training approach: flow matching fine-tuning for video-to-video continuation, followed by physics preference alignment, to improve physical consistency in video generation.
Findings
PhyWorld achieves higher video consistency scores than baselines.
It improves physical plausibility scores on dedicated benchmarks.
Post-training with continuation and physics signals enhances world simulation quality.
Abstract
World simulators can provide safe and scalable environments for training Physical AI systems before real-world deployment. Large video generation models are emerging as a promising basis for such simulators because they can generate diverse and realistic visual futures. However, using them as world simulators requires physically faithful video continuations, namely, generated videos that preserve the physical state implied by the conditioning input, and evolve in ways consistent with basic physical principles. We propose PhyWorld, a video generation world model designed to produce temporally coherent and physically faithful scene continuations through two-stage post-training. In the first stage, we improve video-to-video continuation with flow matching fine-tuning, encouraging stable visual attributes and coherent motion dynamics across frames. In the second stage, we align generated…
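For readers unfamiliar with the first stage, the following is a minimal sketch of a generic flow matching objective with a straight-line (rectified-flow style) path between noise and data. It is not the authors' implementation: the function name `flow_matching_loss`, the callable `velocity_fn`, the linear interpolation path, and the tensor layout are all assumptions made for illustration, and conditioning on the input video is omitted.

```python
import numpy as np

def flow_matching_loss(velocity_fn, x1, rng):
    """Generic flow matching loss on a batch of target samples.

    velocity_fn: hypothetical model, maps (x_t, t) -> predicted velocity.
    x1: array of target samples (e.g. video latents), shape (B, ...).
    rng: numpy random Generator.
    """
    # Noise endpoint of the path, same shape as the data.
    x0 = rng.standard_normal(x1.shape)
    # One time value per sample in [0, 1), broadcast over remaining axes.
    t = rng.uniform(size=(x1.shape[0],) + (1,) * (x1.ndim - 1))
    # Point on the straight line from noise (t=0) to data (t=1).
    xt = (1.0 - t) * x0 + t * x1
    # The velocity of that line is constant: data minus noise.
    target = x1 - x0
    pred = velocity_fn(xt, t)
    # Mean squared error between predicted and target velocity.
    return float(np.mean((pred - target) ** 2))

# Toy usage: a "model" that always predicts zero velocity.
rng = np.random.default_rng(0)
latents = np.ones((2, 4, 3, 8, 8))  # assumed layout: (batch, frames, channels, H, W)
print(flow_matching_loss(lambda xt, t: np.zeros_like(xt), latents, rng))
```

In a real fine-tuning loop `velocity_fn` would be the video model, and the loss would be minimized by gradient descent in an autodiff framework; NumPy is used here only to keep the sketch self-contained.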
