PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models
Qiyuan Zhang, Biao Gong, Shuai Tan, Zheng Zhang, Yujun Shen, Xing Zhu, Yuyuan Li, Kelu Yao, Chunhua Shen, Changqing Zou

TL;DR
PhysRVG introduces a physics-aware reinforcement learning framework for video generation that enforces physical collision rules directly to improve realism. It extends this with a unified fine-tuning approach called MDcycle and is validated on a new benchmark, PhysRVGBench.
Contribution
This work pioneers a physics-aware RL paradigm for video generation, integrating physical collision rules directly into high-dimensional models, and proposes a unified fine-tuning framework called MDcycle.
Findings
Enhanced physical realism in generated videos.
Effective enforcement of collision rules in high-dimensional spaces.
Successful validation on the new PhysRVGBench benchmark.
Abstract
Physical principles are fundamental to realistic visual simulation, but remain a significant oversight in transformer-based video generation. This gap highlights a critical limitation in rendering rigid body motion, a core tenet of classical mechanics. While computer graphics and physics-based simulators can easily model such collisions using Newton formulas, modern pretrain-finetune paradigms discard the concept of object rigidity during pixel-level global denoising. Even perfectly correct mathematical constraints are treated as suboptimal solutions (i.e., conditions) during model optimization in post-training, fundamentally limiting the physical realism of generated videos. Motivated by these considerations, we introduce, for the first time, a physics-aware reinforcement learning paradigm for video generation models that enforces physical collision rules directly in high-dimensional…
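As a point of reference for the "Newton formulas" the abstract mentions, the sketch below shows the kind of closed-form rule a physics-based simulator can apply to a rigid-body collision. It is an illustration only, not the paper's method or reward: the function name `collide_1d`, the restriction to one dimension, and the `restitution` parameter are all assumptions introduced here.

```python
def collide_1d(m1, u1, m2, u2, restitution=1.0):
    """Post-collision velocities of two rigid bodies moving on a line.

    Hypothetical helper for illustration; not taken from the paper.
    m1, m2: masses; u1, u2: velocities before impact.
    restitution: 1.0 for a perfectly elastic collision, 0.0 for perfectly inelastic.
    """
    total_mass = m1 + m2
    momentum = m1 * u1 + m2 * u2      # conserved across the collision
    closing_speed = u1 - u2           # relative velocity before impact
    # Derived from momentum conservation plus Newton's law of restitution:
    # v2 - v1 = restitution * (u1 - u2)
    v1 = (momentum - m2 * restitution * closing_speed) / total_mass
    v2 = (momentum + m1 * restitution * closing_speed) / total_mass
    return v1, v2


if __name__ == "__main__":
    # Equal masses, elastic impact: the moving body stops and the other takes its velocity.
    print(collide_1d(1.0, 1.0, 1.0, 0.0))
```

A rule of this form is exact and cheap to evaluate in a simulator, which is the contrast the abstract draws with pixel-level denoising, where no explicit notion of a rigid object exists.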
Taxonomy
Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis
