TL;DR
This paper introduces a method for learning abstract world models with structured latent spaces that incorporate geometric priors and symmetries, improving generalization and interpretability in RL tasks.
Contribution
The authors propose a novel framework that embeds known symmetries into latent space representations, enhancing model accuracy and disentanglement in environment modeling.
Findings
Better latent transition predictions than unstructured models
Improved learning in RL tasks with rotational and translational features
Results show more disentangled and simpler representations
Abstract
Learning meaningful abstract models of Markov Decision Processes (MDPs) is crucial for improving generalization from limited data. In this work, we show how geometric priors can be imposed on the low-dimensional representation manifold of a learned transition model. We incorporate known symmetric structures via appropriate choices of the latent space and the associated group actions, which encode prior knowledge about invariances in the environment. In addition, our framework allows the embedding of additional unstructured information alongside these symmetries. We show experimentally that this leads to better predictions of the latent transition model than fully unstructured approaches, as well as better learning on downstream RL tasks, in environments with rotational and translational features, including in first-person views of 3D environments. Additionally, our experiments show that…
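To make the idea of a structured latent space with an associated group action concrete, here is a minimal sketch and not the authors' implementation. It assumes a latent state split into a point on the unit circle (acted on by SO(2) rotations) and an unstructured vector. The action names, the `ACTION_TO_ANGLE` table, and the linear map `W` standing in for a learned update are all hypothetical.

```python
import numpy as np

def rotation(theta):
    """Return the 2x2 rotation matrix for angle theta (an element of SO(2))."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Hypothetical mapping from discrete actions to group elements (angles).
ACTION_TO_ANGLE = {
    "turn_left": np.pi / 2,
    "turn_right": -np.pi / 2,
    "stay": 0.0,
}

def latent_step(z_circle, z_free, action, W):
    """One latent transition.

    The structured part is updated by a fixed group action (a rotation),
    so it stays on the unit circle by construction. The unstructured part
    is updated by W, a linear stand-in for a learned map (assumption).
    """
    z_circle_next = rotation(ACTION_TO_ANGLE[action]) @ z_circle
    z_free_next = W @ z_free
    return z_circle_next, z_free_next

# Example rollout: turning left and then right returns to the start.
z_circle = np.array([1.0, 0.0])
z_free = np.array([0.5, -0.2, 0.1])
W = np.eye(3)

z1, f1 = latent_step(z_circle, z_free, "turn_left", W)
z2, f2 = latent_step(z1, f1, "turn_right", W)
```

Because the structured update is a group action, composing inverse actions returns the latent state to where it started, a property that an unstructured learned transition would have to learn from data.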
Peer Reviews
Decision·Submitted to ICLR 2026
- The work tackles a highly relevant problem of improving world model latent and dynamical interpretability.
- While some earlier work has investigated learning latent representations of high-dimensional state/observation spaces by explicitly specifying functions to which the representations should be invariant/equivariant, this paper takes the solution further by selecting a structured latent space where a set of actions produces "legal" transitions, according to some observed geometric behavi
- Your MDP definition is missing the stochasticity of the transition kernel T. Specifically, T should be a mapping T: S x A x S -> [0, 1]. It is okay that you only consider the deterministic case T: S x A x S -> {0, 1} throughout the rest of the formulation and solution, but the definition should align with the mainstream understanding of an MDP. What you have is more of a state-and-reward state machine.
- By replacing the learned transition's composition rule with a fixed, equivariant operator
- Strong theoretical foundations, illustrated with intuitive examples.
- Compelling concept of efficient, low-dimensional representations suitable for downstream tasks.
- The paper is well structured.
- The approach requires prior knowledge about the symmetries in the environments. While the authors state that using more complex symmetries is left to future research, symmetries have to be discovered by humans beforehand, which limits generic applicability.
- The function σ is introduced to disentangle the representation, but details of how it is implemented are missing. Moreover, the results in Appendix E indicate that this additional component may not actually be necessary.
- It is unclear t
* Systematic generalization in world models is an important field of study.
* Performance is reported in both prediction and RL settings.
* The paper is well motivated.
* My main concern lies in the construction of the priors. In the torus world it makes sense that the quotient group captures the structure of the environment, but in more complex environments it's not so easy to say. In Atari games and continuous control settings (like MuJoCo) it's not clear whether it's feasible to use priors meaningfully.
* Related to this: the disentanglement loss seems to require that one pre-specifies the latent dimensions that each action should act on. While I get that the
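The reviewer comment above about the MDP definition can be written out as follows. This is the standard textbook form for a finite state space, not notation taken from the paper; the map $f$ is a name introduced here for illustration.

```latex
% General (stochastic) transition kernel of an MDP with finite state space:
T : S \times A \times S \to [0, 1],
\qquad \sum_{s' \in S} T(s, a, s') = 1 \quad \text{for all } (s, a) \in S \times A.

% Deterministic special case: T takes values in \{0, 1\}, which is equivalent
% to a transition function f : S \times A \to S with
T(s, a, s') = \mathbb{1}\left[\, s' = f(s, a) \,\right].
```

The deterministic case is thus a restriction of the general kernel, which is why the reviewer asks for the definition to be stated in the general form first.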
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
