PhyWorldBench: A Comprehensive Evaluation of Physical Realism in Text-to-Video Models

Jing Gu; Xian Liu; Yu Zeng; Ashwin Nagarajan; Fangrui Zhu; Daniel Hong; Yue Fan; Qianqi Yan; Kaiwen Zhou; Ming-Yu Liu; Xin Eric Wang

arXiv:2507.13428·cs.CV·February 10, 2026

TL;DR

This paper introduces PhyWorldBench, a comprehensive benchmark for evaluating the physical realism of text-to-video models, including a novel anti-physics category and a zero-shot evaluation method using large language models.

Contribution

The paper presents PhyWorldBench, a new benchmark with diverse physical scenarios and a zero-shot evaluation approach, to systematically assess and improve physics adherence in video generation models.

Findings

01. Twelve state-of-the-art models evaluated across 1,050 prompts

02. Models struggle with complex physical interactions and anti-physics prompts

03. Recommendations provided for prompt design to improve physical fidelity

Abstract

Video generation models have achieved remarkable progress in creating high-quality, photorealistic content. However, their ability to accurately simulate physical phenomena remains a critical and unresolved challenge. This paper presents PhyWorldBench, a comprehensive benchmark designed to evaluate video generation models based on their adherence to the laws of physics. The benchmark covers multiple levels of physical phenomena, ranging from fundamental principles such as object motion and energy conservation to more complex scenarios involving rigid body interactions and human or animal motion. Additionally, we introduce a novel Anti-Physics category, where prompts intentionally violate real-world physics, enabling the assessment of whether models can follow such instructions while maintaining logical consistency. Besides large-scale human evaluation, we also design a simple yet…
