PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models

Qiyuan Zhang; Biao Gong; Shuai Tan; Zheng Zhang; Yujun Shen; Xing Zhu; Yuyuan Li; Kelu Yao; Chunhua Shen; Changqing Zou

arXiv:2601.11087 · cs.CV · January 19, 2026

TL;DR

PhysRVG introduces a physics-aware reinforcement learning framework for video generation that enforces physical collision rules directly to improve realism. It extends this with a unified fine-tuning approach called MDcycle and is validated on new benchmarks.

Contribution

This work pioneers a physics-aware RL paradigm for video generation, integrating physical collision rules directly into high-dimensional models, and proposes a unified fine-tuning framework called MDcycle.

Findings

01. Enhanced physical realism in generated videos.

02. Effective enforcement of collision rules in high-dimensional spaces.

03. Successful validation on the new PhysRVGBench benchmark.

Abstract

Physical principles are fundamental to realistic visual simulation, but remain a significant oversight in transformer-based video generation. This gap highlights a critical limitation in rendering rigid body motion, a core tenet of classical mechanics. While computer graphics and physics-based simulators can easily model such collisions using Newton formulas, modern pretrain-finetune paradigms discard the concept of object rigidity during pixel-level global denoising. Even perfectly correct mathematical constraints are treated as suboptimal solutions (i.e., conditions) during model optimization in post-training, fundamentally limiting the physical realism of generated videos. Motivated by these considerations, we introduce, for the first time, a physics-aware reinforcement learning paradigm for video generation models that enforces physical collision rules directly in high-dimensional…
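
As a point of reference for the kind of collision rule the abstract alludes to, the following is a minimal sketch of a one-dimensional collision between two rigid bodies, resolved with conservation of momentum and Newton's law of restitution. It is not taken from the paper: the function name `resolve_collision_1d`, its parameters, and the restitution coefficient are illustrative assumptions, and the paper's actual formulation may differ.

```python
def resolve_collision_1d(m1, v1, m2, v2, restitution=1.0):
    """Return post-collision velocities (u1, u2) of two bodies on a line.

    Hypothetical helper for illustration only; not the paper's method.
    restitution=1.0 is perfectly elastic, 0.0 is perfectly inelastic.
    """
    if m1 <= 0 or m2 <= 0:
        raise ValueError("masses must be positive")
    if not 0.0 <= restitution <= 1.0:
        raise ValueError("restitution must lie in [0, 1]")

    total_mass = m1 + m2
    momentum = m1 * v1 + m2 * v2   # conserved across the collision
    approach = v1 - v2             # relative velocity before impact

    # Solve the two linear equations:
    #   m1*u1 + m2*u2 = m1*v1 + m2*v2      (conservation of momentum)
    #   u2 - u1       = restitution * (v1 - v2)   (law of restitution)
    u1 = (momentum - m2 * restitution * approach) / total_mass
    u2 = (momentum + m1 * restitution * approach) / total_mass
    return u1, u2


if __name__ == "__main__":
    # Equal masses, perfectly elastic: the bodies exchange velocities.
    print(resolve_collision_1d(1.0, 2.0, 1.0, 0.0))
```

For equal masses and a perfectly elastic collision the two bodies simply exchange velocities, which is the sort of exact constraint a generated video of colliding rigid objects would ideally respect.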

Taxonomy

Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis