Learning Abstract World Models with a Group-Structured Latent Space

Thomas Delliaux; Nguyen-Khanh Vu; Vincent François-Lavet; Elise van der Pol; Emmanuel Rachelson

arXiv:2506.01529·cs.LG·May 20, 2026

TL;DR

This paper introduces a method for learning abstract world models with structured latent spaces that incorporate geometric priors and symmetries, improving generalization and interpretability in RL tasks.

Contribution

The authors propose a novel framework that embeds known symmetries into latent space representations, enhancing model accuracy and disentanglement in environment modeling.

Findings

01. Better latent transition predictions than unstructured models

02. Improved learning in RL tasks with rotational and translational features

03. More disentangled and simpler representations

Abstract

Learning meaningful abstract models of Markov Decision Processes (MDPs) is crucial for improving generalization from limited data. In this work, we show how geometric priors can be imposed on the low-dimensional representation manifold of a learned transition model. We incorporate known symmetric structures via appropriate choices of the latent space and the associated group actions, which encode prior knowledge about invariances in the environment. In addition, our framework allows the embedding of additional unstructured information alongside these symmetries. We show experimentally that this leads to better predictions of the latent transition model than fully unstructured approaches, as well as better learning on downstream RL tasks, in environments with rotational and translational features, including in first-person views of 3D environments. Additionally, our experiments show that…
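To make the idea of a latent space with an associated group action concrete, here is a minimal sketch, not the authors' implementation. It assumes a latent state split into a structured angle `theta` on a circle, acted on by rotations, and an unstructured vector `u` updated by a per-action linear map. The names `act_on_circle`, `act_on_torus`, `latent_step`, `deltas`, and `W` are hypothetical and chosen only for illustration; in a learned model the encoder and the map on `u` would be trained rather than fixed.

```python
import numpy as np

TWO_PI = 2.0 * np.pi


def act_on_circle(theta, delta):
    """Rotate an angle latent by delta and wrap the result to [0, 2*pi)."""
    return np.mod(theta + delta, TWO_PI)


def act_on_torus(z, shift):
    """Translate a 2-torus latent z = (x, y); each coordinate wraps to [0, 1)."""
    z = np.asarray(z, dtype=float)
    shift = np.asarray(shift, dtype=float)
    return np.mod(z + shift, 1.0)


def latent_step(theta, u, action, deltas, W):
    """One latent transition (illustrative assumption, not the paper's model).

    The structured part follows a fixed group action; the unstructured
    part is updated by a hypothetical per-action linear map W[action].
    """
    theta_next = act_on_circle(theta, deltas[action])
    u_next = W[action] @ u
    return theta_next, u_next


# Hypothetical action set: 0 = turn left by 90 degrees, 1 = turn right by 90 degrees.
deltas = np.array([np.pi / 2, -np.pi / 2])
# Hypothetical maps for the unstructured latent: identity for action 0, shrink for action 1.
W = np.stack([np.eye(3), 0.5 * np.eye(3)])

theta, u = 0.3, np.ones(3)
for _ in range(4):
    # Four quarter turns compose to a full rotation, so theta returns to its start.
    theta, u = latent_step(theta, u, 0, deltas, W)
```

Because the composition rule is fixed by the group rather than learned, properties such as "four quarter turns equal the identity" or "left then right cancels" hold by construction in this sketch, which is the kind of prior knowledge the abstract describes encoding.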

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 3

Strengths

- The work tackles a highly relevant problem of improving world model latent and dynamical interpretability.
- While some earlier work has investigated learning latent representations of high-dimensional state/observation spaces by explicitly specifying functions to which the representations should be invariant/equivariant, this paper takes the solution further by selecting a structured latent space where a set of actions produces "legal" transitions, according to some observed geometric behavi

Weaknesses

- Your MDP definition is missing the stochasticity of the transition kernel T. Specifically, T should be a mapping T: S x A x S -> [0, 1]. It is okay that you only consider the deterministic case T: S x A x S -> {0, 1} throughout the rest of the formulation and solution, but the definition should align with the mainstream understanding of an MDP. What you have is more of a state-and-reward state machine.
- By replacing the learned transition's composition rule with a fixed, equivariant operator

Reviewer 02 · Rating 2 · Confidence 4

Strengths

- Strong theoretical foundations, illustrated with intuitive examples
- Compelling concept of efficient, low-dimensional representations suitable for downstream tasks
- Paper is well structured

Weaknesses

- The approach requires prior knowledge about the symmetries in the environments. While the authors state that using more complex symmetries is up to future research, symmetries have to be discovered by humans beforehand, which limits generic applicability.
- The function σ is introduced to disentangle the representation, but details of how this is implemented are missing. Moreover, the results in Appendix E indicate that this additional component may actually not be necessary.
- It is unclear t

Reviewer 03 · Rating 2 · Confidence 4

Strengths

* Systematic generalization in world models is an important field of study.
* Performance is reported in both prediction and RL settings.
* The paper is well motivated.

Weaknesses

* My main concern lies in the construction of the priors. In the torus world it makes sense that the quotient group captures the structure of the environment, but in more complex environments it's not so easy to say. In Atari games and continuous control settings (like MuJoCo) it's not clear whether it's feasible to use priors meaningfully.
* Related to this: the disentanglement loss seems to require that one pre-specifies the latent dimensions that each action should act on. While I get that the

Code & Models

Repositories

github

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling