Physically Native World Models: A Hamiltonian Perspective on Generative World Modeling
Sen Cui, Jingheng Ma

TL;DR
This paper introduces Hamiltonian World Models, a physically grounded approach to generative world modeling that encodes observations into a structured latent space and evolves them via Hamiltonian-inspired dynamics for more meaningful and stable long-term predictions.
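For orientation, the textbook form of Hamiltonian dynamics that such a latent model would build on can be written as follows. This is a sketch under stated assumptions, not the paper's formulation: the encoder $E_\phi$, the learned energy function $H_\theta$, and the split of the latent state into generalized position $q$ and momentum $p$ are hypothetical notation that does not appear in this summary.

```latex
% Assumed notation: E_\phi (encoder), H_\theta (learned Hamiltonian), o_t (observation)
\begin{align}
  z_t &= (q_t, p_t) = E_\phi(o_t), \\
  \dot{q} &= \frac{\partial H_\theta}{\partial p}, \qquad
  \dot{p} = -\frac{\partial H_\theta}{\partial q}, \\
  \frac{dH_\theta}{dt} &= \frac{\partial H_\theta}{\partial q}\,\dot{q}
                        + \frac{\partial H_\theta}{\partial p}\,\dot{p} = 0 .
\end{align}
```

The last line holds for a time-independent Hamiltonian with no action input or dissipation: energy is conserved along trajectories, which is the usual source of the stability argument for this class of models.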
Contribution
It proposes a novel Hamiltonian-based framework for world models that emphasizes physical interpretability, stability, and action controllability in embodied decision-making tasks.
Findings
Hamiltonian structure may improve interpretability and data efficiency.
The approach aims for long-horizon stability in predictions.
Practical challenges include modeling friction, contact, and deformable objects.
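The long-horizon stability claim can be illustrated with a toy example. The sketch below is not the paper's model: it uses a hand-written harmonic oscillator in place of a learned Hamiltonian, and all function names are hypothetical. It compares a symplectic leapfrog integrator, which respects the Hamiltonian structure, against explicit Euler, which does not, and measures how far the energy drifts over a long rollout.

```python
def hamiltonian(q, p, k=1.0, m=1.0):
    """Energy of a harmonic oscillator: H = p^2 / (2m) + k q^2 / 2."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def grad_h(q, p, k=1.0, m=1.0):
    """Return (dH/dq, dH/dp) for the oscillator above."""
    return k * q, p / m


def leapfrog_step(q, p, dt):
    """One symplectic leapfrog step: half kick, full drift, half kick."""
    dh_dq, _ = grad_h(q, p)
    p_half = p - 0.5 * dt * dh_dq
    _, dh_dp = grad_h(q, p_half)
    q_new = q + dt * dh_dp
    dh_dq_new, _ = grad_h(q_new, p_half)
    p_new = p_half - 0.5 * dt * dh_dq_new
    return q_new, p_new


def euler_step(q, p, dt):
    """One explicit Euler step (not symplectic), used as a baseline."""
    dh_dq, dh_dp = grad_h(q, p)
    return q + dt * dh_dp, p - dt * dh_dq


def rollout_energies(step, q0, p0, dt, n_steps):
    """Roll the state forward and record the energy after every step."""
    q, p = q0, p0
    energies = [hamiltonian(q, p)]
    for _ in range(n_steps):
        q, p = step(q, p, dt)
        energies.append(hamiltonian(q, p))
    return energies


def max_energy_drift(energies):
    """Largest absolute deviation from the initial energy."""
    e0 = energies[0]
    return max(abs(e - e0) for e in energies)


if __name__ == "__main__":
    leap = rollout_energies(leapfrog_step, 1.0, 0.0, dt=0.1, n_steps=1000)
    euler = rollout_energies(euler_step, 1.0, 0.0, dt=0.1, n_steps=1000)
    print("leapfrog max energy drift:", max_energy_drift(leap))
    print("euler    max energy drift:", max_energy_drift(euler))
```

With leapfrog the energy error stays bounded over the whole rollout, while with explicit Euler it grows at every step. Friction, contact, and deformable objects break the conservative assumption used here, which is why the summary lists them as practical challenges.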
Abstract
World models have recently re-emerged as a central paradigm for embodied intelligence, robotics, autonomous driving, and model-based reinforcement learning. However, current world model research is often dominated by three partially separated routes: 2D video-generative models that emphasize visual future synthesis, 3D scene-centric models that emphasize spatial reconstruction, and JEPA-like latent models that emphasize abstract predictive representations. While each route has made important progress, they still struggle to provide physically reliable, action-controllable, and long-horizon stable predictions for embodied decision making. In this paper, we argue that the bottleneck of world models is no longer only whether they can generate realistic futures, but whether those futures are physically meaningful and useful for action. We propose Hamiltonian World Models as a…
