PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation
Yidong Huang, Zun Wang, Han Lin, Dong-Ki Kim, Shayegan Omidshafiei, Jaehong Yoon, Jaemin Cho, Yue Zhang, Mohit Bansal

TL;DR
PhyMotion introduces a physics-grounded, multi-dimensional motion reward for human video generation, improving realism by evaluating 3D motion plausibility in a physics simulator.
Contribution
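The paper proposes a structured, multi-dimensional reward that recovers 3D human trajectories from generated videos and scores their physical feasibility in a physics simulator, giving video generation a training signal aimed specifically at motion realism.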
Findings
PhyMotion scores correlate better with human judgments of motion realism than existing video rewards do.
Optimizing for the PhyMotion reward improves motion realism by 68 Elo points in human evaluations.
The three axes of the reward provide complementary supervision signals.
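To put the Elo gain in perspective, the short sketch below converts an Elo gap into an expected pairwise preference rate. It assumes the standard Elo logistic model with a scale of 400; the summary does not state how the Elo ratings were computed, so the function name and the resulting percentage are illustrative rather than figures reported by the authors.

```python
def elo_expected_score(elo_gap: float) -> float:
    """Expected win probability for the higher-rated side.

    Assumes the standard Elo logistic curve with a 400-point scale;
    this is an assumption, not a detail taken from the paper.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))


# Elo gain quoted in the Findings list above.
gap = 68.0
win_rate = elo_expected_score(gap)
print(f"Expected preference rate at +{gap:.0f} Elo: {win_rate:.3f}")
```

Under that assumption, a 68-point gap means the optimized model's videos would be preferred in roughly 60% of head-to-head comparisons.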
Abstract
Generating realistic human motion is a central yet unsolved challenge in video generation. While reinforcement learning (RL)-based post-training has driven recent gains in general video quality, extending it to human motion remains bottlenecked by a reward signal that cannot reliably score motion realism. Existing video rewards primarily rely on 2D perceptual signals, without explicitly modeling the 3D body state, contact, and dynamics underlying articulated human motion, and often assign high scores to videos with floating bodies or physically implausible movements. To address this, we propose PhyMotion, a structured, fine-grained motion reward that grounds recovered 3D human trajectories in a physics simulator and evaluates motion quality along multiple dimensions of physical feasibility. Concretely, we recover SMPL body meshes from generated videos, retarget them onto a humanoid in…
