PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos
Meng Cao, Haoran Tang, Haoze Zhao, Hangyu Guo, Jiaheng Liu, Ge Zhang, Ruyang Liu, Qiang Sun, Ian Reid, Xiaodan Liang

TL;DR
This paper introduces PhysGame, a benchmark for evaluating physical commonsense violations in gameplay videos, and proposes PhysVLM, a knowledge-enhanced video LLM, to improve understanding of physical glitches.
Contribution
The paper presents PhysGame as a new benchmark, curates datasets for instruction tuning and preference optimization, and develops PhysVLM to advance physical commonsense reasoning in video LLMs.
Findings
Current open-source video LLMs underperform proprietary models.
PhysVLM achieves state-of-the-art results on PhysGame and general benchmarks.
The curated instruction-tuning and preference-optimization datasets, together with PhysVLM, significantly improve physical commonsense understanding in video LLMs.
Abstract
Recent advancements in video-based large language models (Video LLMs) have witnessed the emergence of diverse capabilities to reason and interpret dynamic visual content. Among them, gameplay videos stand out as a distinctive data source, often containing glitches that defy physics commonsense. This characteristic renders them an effective benchmark for assessing the under-explored capability of physical commonsense understanding in video LLMs. In this paper, we propose PhysGame as a pioneering benchmark to evaluate physical commonsense violations in gameplay videos. PhysGame comprises 880 videos associated with glitches spanning four fundamental domains (i.e., mechanics, kinematics, optics, and material properties) and across 12 distinct physical commonsense. Through extensively evaluating various state-of-the-art video LLMs, our findings reveal that the performance of current…
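To make the benchmark structure concrete, here is a minimal Python sketch of how per-domain accuracy over PhysGame-style items might be tallied. Only the four domain names come from the abstract; the `GlitchItem` fields, the label format, and the `accuracy_by_domain` helper are hypothetical illustrations, not the authors' evaluation code.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four domains named in the abstract.
DOMAINS = ("mechanics", "kinematics", "optics", "material properties")


@dataclass
class GlitchItem:
    video_id: str  # hypothetical identifier for a gameplay clip
    domain: str    # one of DOMAINS
    answer: str    # hypothetical gold label (assumed format, e.g. a choice letter)


def accuracy_by_domain(items, predictions):
    """Return {domain: fraction correct}; predictions maps video_id -> label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        if item.domain not in DOMAINS:
            raise ValueError(f"unknown domain: {item.domain}")
        total[item.domain] += 1
        # A missing prediction counts as incorrect.
        if predictions.get(item.video_id) == item.answer:
            correct[item.domain] += 1
    return {domain: correct[domain] / total[domain] for domain in total}


# Toy data for illustration only; these are not benchmark results.
items = [
    GlitchItem("v1", "mechanics", "A"),
    GlitchItem("v2", "mechanics", "C"),
    GlitchItem("v3", "optics", "B"),
]
predictions = {"v1": "A", "v2": "B", "v3": "B"}
scores = accuracy_by_domain(items, predictions)
```

Reporting scores per domain rather than as a single number would show whether a model struggles with, say, optics glitches more than mechanics ones.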
Taxonomy
Topics: Advanced Malware Detection Techniques · Digital and Cyber Forensics
