Learning Group Actions In Disentangled Latent Image Representations

Farhana Hossain Swarnali; Miaomiao Zhang; Tonmoy Hossain

arXiv:2512.04015 · cs.CV · December 16, 2025

Open Access

TL;DR

This paper presents an end-to-end method for learning group actions on latent image representations. It automatically discovers transformation-relevant structure without manual partitioning of the latent space, with the aim of making image transformations more controllable.

Contribution

Introduces a novel framework that learns group actions on latent spaces with automatic disentanglement, using learnable masks within standard encoder-decoder architectures.
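To make the idea of a mask-selected latent subspace concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a latent vector `z`, a hard binary mask obtained by thresholding hypothetical real-valued logits, and a group element represented as a matrix acting on the latent space. The names `binary_mask` and `apply_masked_action`, the thresholding rule, and the rotation used as the group action are all assumptions made for illustration; the paper's actual parameterization and training procedure are not described in this summary.

```python
import numpy as np

def binary_mask(logits):
    """Turn real-valued logits into a hard 0/1 mask (assumed parameterization)."""
    return (logits > 0).astype(np.float64)

def apply_masked_action(z, logits, action):
    """Apply a group action only to the mask-selected latent coordinates.

    z:      (d,) latent vector
    logits: (d,) mask logits; positive entries mark transformation-varying dims
    action: (d, d) matrix representing one group element (an assumption here)
    """
    m = binary_mask(logits)
    transformed = action @ (m * z)          # act on the varying part only
    return m * transformed + (1.0 - m) * z  # invariant part passes through

# Toy example: 4-d latent whose first two coordinates vary under a rotation.
z = np.array([1.0, 0.0, 5.0, -3.0])
logits = np.array([2.0, 1.5, -1.0, -2.0])

theta = np.pi / 2
action = np.eye(4)
action[:2, :2] = [[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]]

z_new = apply_masked_action(z, logits, action)
z_same = apply_masked_action(z, logits, np.eye(4))
```

In this toy setup the last two coordinates of `z_new` equal those of `z`, while the first two are rotated by 90 degrees. In a trainable model the hard threshold would need a differentiable surrogate (for example a straight-through estimator), which this sketch deliberately leaves out.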

Findings

01 Successfully learned disentangled latent factors for group actions across diverse datasets

02 Improved controllability of image transformations in latent space

03 Enhanced downstream classification performance using learned representations

Abstract

Modeling group actions on latent representations enables controllable transformations of high-dimensional image data. Prior works applying group-theoretic priors or modeling transformations typically operate in the high-dimensional data space, where group actions apply uniformly across the entire input, making it difficult to disentangle the subspace that varies under transformations. While latent-space methods offer greater flexibility, they still require manual partitioning of latent variables into equivariant and invariant subspaces, limiting the ability to robustly learn and operate group actions within the representation space. To address this, we introduce a novel end-to-end framework that for the first time learns group actions on latent image manifolds, automatically discovering transformation-relevant structures without manual intervention. Our method uses learnable binary…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Domain Adaptation and Few-Shot Learning