Disentangled Representation Learning through Unsupervised Symmetry Group Discovery
Barthélémy Dang-Nhu, Louis Annabi, Sylvain Argentieri

TL;DR
This paper introduces an unsupervised method for agents to autonomously discover symmetry group structures in environments, enabling improved disentangled representation learning without prior group knowledge.
Contribution
It removes the need for prior symmetry group assumptions by proposing algorithms for discovering group structures and learning LSBD representations through interaction data.
Findings
Successfully discovers group decompositions in diverse environments
Outperforms existing LSBD methods in experiments
Proves identifiability of symmetry group decomposition under minimal assumptions
Abstract
Symmetry-based disentangled representation learning leverages the group structure of environment transformations to uncover the latent factors of variation. Prior approaches to symmetry-based disentanglement have required strong prior knowledge of the symmetry group's structure, or restrictive assumptions about the subgroup properties. In this work, we remove these constraints by proposing a method whereby an embodied agent autonomously discovers the group structure of its action space through unsupervised interaction with the environment. We prove the identifiability of the true symmetry group decomposition under minimal assumptions, and derive two algorithms: one for discovering the group decomposition from interaction data, and another for learning Linear Symmetry-Based Disentangled (LSBD) representations without assuming specific subgroup properties. Our method is validated on three…
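To make the LSBD idea concrete, the following is a minimal sketch, not the authors' code. It assumes a hypothetical symmetry group that is a product of two cyclic groups (say, movement along x and along y on a wrapped grid), with each factor represented by 2D rotations acting on its own pair of latent coordinates. The names `move_x`, `move_y`, and the group order `N` are illustrative assumptions. The point it shows is the defining property of a linear symmetry-based disentangled representation: each subgroup acts linearly on its own subspace and leaves the others untouched.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix, used here to represent one step of a cyclic group."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def block_diag(*blocks):
    """Stack square blocks along the diagonal of a zero matrix."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out

N = 5                    # hypothetical order of each cyclic factor
step = 2 * np.pi / N     # one action = rotation by 1/N of a full turn

# Each action matrix is block-diagonal: one block moves, the other is identity.
move_x = block_diag(rotation(step), np.eye(2))
move_y = block_diag(np.eye(2), rotation(step))

z = np.array([1.0, 0.0, 1.0, 0.0])   # latent code: (x-subspace, y-subspace)
z_after_x = move_x @ z

# Acting with the x-subgroup leaves the y-subspace unchanged.
x_leaves_y_untouched = np.allclose(z_after_x[2:], z[2:])
# Applying the action N times returns to the start (cyclic group of order N).
cycle_closes = np.allclose(np.linalg.matrix_power(move_x, N), np.eye(4))
# Actions from different factors of a direct product commute.
factors_commute = np.allclose(move_x @ move_y, move_y @ move_x)
```

In this toy setting the block structure is written by hand; the abstract's claim is that the decomposition into such blocks can instead be discovered from interaction data.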
Peer Reviews
Decision: ICLR 2026 Poster
1. **Novelty:** The core contribution—learning an LSBD representation via three steps—is instructive for disentanglement learning. This shifts the paradigm from requiring prior knowledge to autonomously learning it from interaction. 2. **Theoretical Grounding:** The paper is built on a solid theoretical foundation, providing formal proofs for its key claims. This guarantees the existence of actions belonging to the same subgroup (Theorem 2) and the of learning a Linear Symmetry-Based Disentangl
1. Multi-Stage Pipeline: The method is not end-to-end. It requires training two separate models sequentially: first the A-VAE to learn action matrices, and then the GMA-VAE to learn the final representation. The authors acknowledge this limitation, noting that a future direction would be to unify these steps into a single optimization process. 2. Limited Scope of Environments: The experiments are conducted on synthetic, visually simple datasets. While these are well-suited for proving the gro
My understanding of the framework is relatively superficial and I did not check the maths carefully. Strengths: - Careful explanations. - Thorough comparison with other methods and with different metrics of disentanglement, which show an advantage over other unsupervised methods.
- How is the method less supervised than LSBD-VAE? It seems pretty supervised to me, with access to actions and consequences of these actions? What does it mean that "Both of our methods rely on a strong assumption which requires the available actions to be disentangled"?
1. Clear LSBD pipeline that separates (i) equivariant pretraining, (ii) group discovery, (iii) block-structured LSBD learning, with explicit assumptions and a clustering rule grounded in group algebra. 2. Good baseline experiments within the LSBD family (Forward-VAE/SOBDRL/LSBD-VAE variants) and consistent reporting on multiple disentanglement metrics and multi-step prediction. 3. Reproducibility: code, dataset generation, and hyperparameters are described and (per the authors) released.
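As an illustration of what a clustering rule grounded in group algebra can look like, here is a hedged sketch. It is not the paper's rule, which is not reproduced in this excerpt. It relies on one standard fact: in a direct product of groups, elements of different factors commute, so if every action belongs to a single factor (the disentangled-actions assumption quoted above), two actions that fail to commute must belong to the same factor. The helper names `perm_matrix` and `group_by_noncommutation`, and the toy permutation actions, are hypothetical.

```python
import numpy as np

def perm_matrix(perm):
    """Permutation matrix sending basis vector i to basis vector perm[i]."""
    n = len(perm)
    m = np.zeros((n, n))
    for src, dst in enumerate(perm):
        m[dst, src] = 1.0
    return m

def group_by_noncommutation(mats, tol=1e-6):
    """Merge actions whose matrices do not commute (union-find).

    Only a heuristic: actions of the same abelian factor always commute,
    so they are NOT merged by this test alone.
    """
    parent = list(range(len(mats)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(mats)):
        for j in range(i + 1, len(mats)):
            commutator = mats[i] @ mats[j] - mats[j] @ mats[i]
            if np.linalg.norm(commutator) > tol:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(mats)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# Toy action set: two generators of S3 acting on coordinates 0-2,
# and one swap acting on coordinates 3-4 (a separate Z2 factor).
actions = [
    perm_matrix([1, 0, 2, 3, 4]),  # swap coordinates 0 and 1
    perm_matrix([1, 2, 0, 3, 4]),  # 3-cycle on coordinates 0, 1, 2
    perm_matrix([0, 1, 2, 4, 3]),  # swap coordinates 3 and 4
]
clusters = group_by_noncommutation(actions)
print(clusters)  # [[0, 1], [2]]
```

The commutation test separates the non-abelian factor from the rest but cannot split or merge abelian factors on its own, which is one reason a full method needs a stronger criterion than this sketch.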
1. **Lack of realistic interactive experiments.** The environments are synthetic (Flatland, COIL with permutations, 3DShapes). There are no tests on widely used embodied/control suites (e.g., DeepMind Control/ProcGen/Habitat/ManiSkill) where continuous groups (SO(2), SE(2), SE(3)) and sensor noise dominate, and where symmetries are only approximate. By contrast, prior interactive symmetry/LSBD works motivate interaction explicitly and evaluate on non-trivial dynamics (e.g., SOBDRL/Forward-VAE),
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
