TL;DR
GeoWorld introduces a hyperbolic geometric approach to improve energy-based predictive world models for multi-step visual planning, addressing limitations of Euclidean latent representations and long-horizon prediction degradation.
Contribution
GeoWorld proposes a Hyperbolic JEPA that preserves geometric and hierarchical structure in latent space, and introduces Geometric Reinforcement Learning for stable multi-step planning.
Findings
Achieved around 3% success rate (SR) improvement in 3-step planning.
Achieved around 2% success rate (SR) improvement in 4-step planning.
Demonstrated effectiveness on CrossTask and COIN datasets.
Abstract
Energy-based predictive world models provide a powerful approach for multi-step visual planning by reasoning over latent energy landscapes rather than generating pixels. However, existing approaches face two major challenges: (i) their latent representations are typically learned in Euclidean space, neglecting the underlying geometric and hierarchical structure among states, and (ii) they struggle with long-horizon prediction, which leads to rapid degradation across extended rollouts. To address these challenges, we introduce GeoWorld, a geometric world model that preserves geometric structure and hierarchical relations through a Hyperbolic JEPA, which maps latent representations from Euclidean space onto hyperbolic manifolds. We further introduce Geometric Reinforcement Learning for energy-based optimization, enabling stable multi-step planning in hyperbolic latent space. Extensive…
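To make the idea of mapping Euclidean latents onto a hyperbolic manifold concrete, here is a minimal sketch. It assumes the Poincaré ball model with curvature -c and the exponential map at the origin; the text above does not say which hyperbolic model or mapping GeoWorld uses, so this is an illustration of the general technique, not the paper's implementation. The function names `exp_map_origin` and `poincare_distance` are hypothetical.

```python
import math


def exp_map_origin(v, c=1.0):
    """Map a Euclidean vector v into the Poincare ball of curvature -c.

    Assumption: exponential map at the origin; the paper may use a
    different hyperbolic model or mapping.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0] * len(v)
    sqrt_c = math.sqrt(c)
    scale = math.tanh(sqrt_c * norm) / (sqrt_c * norm)
    return [scale * x for x in v]


def poincare_distance(x, y, c=1.0):
    """Geodesic distance between two points inside the Poincare ball."""
    def sq_norm(u):
        return sum(a * a for a in u)

    diff = sq_norm([a - b for a, b in zip(x, y)])
    denom = (1.0 - c * sq_norm(x)) * (1.0 - c * sq_norm(y))
    return math.acosh(1.0 + 2.0 * c * diff / denom) / math.sqrt(c)


# Usage: two Euclidean latents are mapped into the ball and compared there.
z_a = exp_map_origin([0.3, 0.4])
z_b = exp_map_origin([-1.0, 2.0])
d_ab = poincare_distance(z_a, z_b)
```

Every mapped point lands strictly inside the ball of radius 1/sqrt(c), and distances grow rapidly toward the boundary. That property is what makes hyperbolic space a natural fit for tree-like, hierarchical relations among states.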
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · AI-based Problem Solving and Planning
