What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards
Minh-Quan Le, Yuanzhi Zhu, Vicky Kalogeiton, Dimitris Samaras

TL;DR
This paper introduces NewtonRewards, a post-training framework for video generation that enforces Newtonian physics using verifiable proxies, significantly improving physical realism and motion consistency in generated videos.
Contribution
It presents the first physics-grounded post-training method utilizing verifiable rewards based on measurable proxies for physical properties in video generation.
Findings
Improves physical plausibility and motion smoothness in generated videos.
Maintains performance under out-of-distribution conditions.
Outperforms prior methods on the Newtonian Motion Primitives benchmark.
Abstract
Recent video diffusion models can synthesize visually compelling clips, yet often violate basic physical laws: objects float, accelerations drift, and collisions behave inconsistently, revealing a persistent gap between visual realism and physical realism. We propose NewtonRewards, the first physics-grounded post-training framework for video generation based on verifiable rewards. Instead of relying on human or VLM feedback, NewtonRewards extracts measurable proxies from generated videos using frozen utility models: optical flow serves as a proxy for velocity, while high-level appearance features serve as a proxy for mass. These proxies enable explicit enforcement of Newtonian structure through two complementary rewards: a Newtonian kinematic constraint enforcing constant-acceleration dynamics, and a mass conservation reward preventing trivial,…
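To make the kinematic constraint concrete, here is a minimal sketch of how a constant-acceleration check on a velocity proxy could be scored. It is an illustration under stated assumptions, not the paper's actual reward: the function names `constant_acceleration_residual` and `kinematic_reward`, the use of a per-frame mean velocity vector (for example, averaged optical flow over an object), and the squared second-difference penalty are all hypothetical choices. The idea is only that under constant acceleration, velocity is linear in time, so its discrete second difference should vanish.

```python
import numpy as np


def constant_acceleration_residual(velocities):
    """Mean squared second difference of a velocity sequence.

    velocities: array-like of shape (T, D), one velocity vector per frame
    (assumed here to come from a flow-based proxy; this is an assumption).
    Under constant acceleration v[t+1] - 2*v[t] + v[t-1] == 0 for all t,
    so the residual is zero.
    """
    v = np.asarray(velocities, dtype=float)
    if v.ndim != 2 or v.shape[0] < 3:
        raise ValueError("expected shape (T, D) with T >= 3")
    second_diff = np.diff(v, n=2, axis=0)  # shape (T - 2, D)
    return float(np.mean(np.sum(second_diff ** 2, axis=-1)))


def kinematic_reward(velocities):
    """Hypothetical reward: higher (closer to 0) means more Newtonian."""
    return -constant_acceleration_residual(velocities)


# Toy sequences over 8 frames (units are arbitrary).
t = np.arange(8, dtype=float)
free_fall = np.stack([np.zeros_like(t), 9.8 * t], axis=1)  # velocity linear in time
drifting = np.stack([np.zeros_like(t), t ** 2], axis=1)    # acceleration grows over time

free_fall_residual = constant_acceleration_residual(free_fall)
drifting_residual = constant_acceleration_residual(drifting)
```

In this toy setup the free-fall sequence receives a residual of essentially zero, while the sequence with drifting acceleration is penalized, so `kinematic_reward` ranks the first above the second. A real pipeline would additionally need the frozen optical-flow model and a way to backpropagate through it, neither of which is shown here.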
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Music Technology and Sound Studies · Human Motion and Animation
