"PhyWorldBench": A Comprehensive Evaluation of Physical Realism in Text-to-Video Models
Jing Gu, Xian Liu, Yu Zeng, Ashwin Nagarajan, Fangrui Zhu, Daniel Hong, Yue Fan, Qianqi Yan, Kaiwen Zhou, Ming-Yu Liu, Xin Eric Wang

TL;DR
This paper introduces PhyWorldBench, a comprehensive benchmark for evaluating the physical realism of text-to-video models. It includes a novel Anti-Physics category and a zero-shot evaluation method using large language models.
Contribution
The paper presents PhyWorldBench, a new benchmark with diverse physical scenarios and a zero-shot evaluation approach, to systematically assess and improve physics adherence in video generation models.
Findings
12 state-of-the-art models evaluated across 1,050 prompts
Models struggle with complex physical interactions and anti-physics prompts
Recommendations provided for prompt design to improve physical fidelity
Abstract
Video generation models have achieved remarkable progress in creating high-quality, photorealistic content. However, their ability to accurately simulate physical phenomena remains a critical and unresolved challenge. This paper presents PhyWorldBench, a comprehensive benchmark designed to evaluate video generation models based on their adherence to the laws of physics. The benchmark covers multiple levels of physical phenomena, ranging from fundamental principles such as object motion and energy conservation to more complex scenarios involving rigid body interactions and human or animal motion. Additionally, we introduce a novel Anti-Physics category, where prompts intentionally violate real-world physics, enabling the assessment of whether models can follow such instructions while maintaining logical consistency. Besides large-scale human evaluation, we also design a simple yet…
