LaWM: Least Action World Models for Long-Horizon Physical Consistency from Visual Observations
Qixin Xiao, Maani Ghaffari

TL;DR
LaWM introduces a physics-inspired latent world model that enforces the Principle of Least Action, leading to more physically consistent long-horizon predictions from visual data.
Contribution
LaWM operationalizes the Principle of Least Action in a learned visual latent space, using a latent variational integrator to improve the physical consistency of rollouts.
Findings
Enhances physical invariance and background consistency.
Improves motion smoothness and geometric prediction.
Outperforms baselines on synthetic and robotic benchmarks.
Abstract
Learning predictive world models from visual observations is a core problem in embodied AI, with applications to model-based reinforcement learning and robotic planning. Existing latent world models typically generate future states with unconstrained neural transition functions, while modern video generation systems often prioritize perceptual plausibility or introduce physical structure through auxiliary losses, external guidance, or separate dynamics modules. As a result, long-horizon rollouts can remain weakly grounded in the physical principles that govern real dynamics, leading to compounding error, energy drift, and physically inconsistent futures. We propose Least Action World Models (LaWM), a latent world-modeling framework that operationalizes the Principle of Least Action in learned visual latent space: future rollouts are governed by a learned Lagrangian action functional…
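To make the idea of a rollout governed by an action functional concrete, here is a minimal sketch of a classical variational integrator. It is not LaWM's implementation: LaWM learns its Lagrangian in a visual latent space, whereas this example assumes a hand-written harmonic-oscillator Lagrangian on a single coordinate. All function names, the trapezoidal discrete Lagrangian, and the step size are assumptions chosen for illustration. Each step picks the next state so that the discrete action is stationary, which is the discrete Euler-Lagrange condition.

```python
# Illustrative only: a hand-written 1-D Lagrangian, not a learned latent one.
# Discrete Lagrangian (trapezoidal rule), with V(q) = 0.5 * k * q**2:
#   L_d(a, b) = h * (0.5 * m * ((b - a) / h)**2 - 0.5 * (V(a) + V(b)))


def grad_potential(q, k=1.0):
    """Gradient of the harmonic potential V(q) = 0.5 * k * q**2."""
    return k * q


def del_residual(q_prev, q_curr, q_next, h, m=1.0):
    """Discrete Euler-Lagrange residual:
    D2 L_d(q_prev, q_curr) + D1 L_d(q_curr, q_next)."""
    return m * (2.0 * q_curr - q_prev - q_next) / h - h * grad_potential(q_curr)


def del_step(q_prev, q_curr, h, m=1.0):
    """Solve del_residual == 0 for q_next (closed form for this L_d)."""
    return 2.0 * q_curr - q_prev - (h * h / m) * grad_potential(q_curr)


def rollout(q0, q1, h, steps, m=1.0):
    """Roll the trajectory forward by repeatedly making the action stationary."""
    traj = [q0, q1]
    for _ in range(steps):
        traj.append(del_step(traj[-2], traj[-1], h, m))
    return traj


h = 0.1
q0 = 1.0
# Second point chosen so the initial discrete momentum is zero (start at rest, m = 1).
q1 = q0 - 0.5 * h * h * grad_potential(q0)
trajectory = rollout(q0, q1, h, steps=10_000)

# Largest violation of the stationarity condition along the rollout.
max_residual = max(
    abs(del_residual(a, b, c, h))
    for a, b, c in zip(trajectory, trajectory[1:], trajectory[2:])
)
# Largest excursion from the origin over the whole rollout.
max_amplitude = max(abs(q) for q in trajectory)
```

In this toy setting the oscillation amplitude stays bounded over ten thousand steps rather than drifting, which is the kind of long-horizon behavior the abstract contrasts with unconstrained transition functions. A learned version would replace `grad_potential` and the fixed kinetic term with network outputs, and would generally need a numerical solve in place of the closed-form `del_step`.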
