Learning Group Actions In Disentangled Latent Image Representations
Farhana Hossain Swarnali, Miaomiao Zhang, Tonmoy Hossain

TL;DR
This paper presents an end-to-end method for learning group actions on latent image representations. It automatically discovers transformation-relevant structure without manual partitioning of the latent space, which makes image transformations more controllable.
Contribution
Introduces a novel framework that learns group actions on latent spaces with automatic disentanglement, using learnable masks within standard encoder-decoder architectures.
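To make the idea of a masked group action concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the mask parameterization (sigmoid of logits, hard-thresholded), the choice of planar rotations as the group, and every function name below are assumptions made for illustration. The sketch only shows how a 0/1 mask can split a latent vector into a part that a group element acts on and a part that is left unchanged.

```python
import numpy as np

def binary_mask(logits, threshold=0.5):
    """Hard 0/1 mask from real-valued logits (assumed parameterization).
    Actual training would need a differentiable relaxation of this step."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    return (probs > threshold).astype(np.float64)

def rotation_representation(theta, dim):
    """Block-diagonal matrix rotating consecutive latent pairs by theta.
    Rotations are a stand-in group chosen for this sketch."""
    assert dim % 2 == 0, "latent dimension must be even for 2x2 blocks"
    c, s = np.cos(theta), np.sin(theta)
    rho = np.zeros((dim, dim))
    for i in range(0, dim, 2):
        rho[i:i + 2, i:i + 2] = [[c, -s], [s, c]]
    return rho

def act_on_latent(z, theta, mask):
    """Apply the group element to the masked coordinates only.
    Coordinates where mask == 0 pass through unchanged."""
    rho = rotation_representation(theta, z.shape[-1])
    z_varying = mask * z            # part that transforms under the group
    z_invariant = (1.0 - mask) * z  # part that is left untouched
    return mask * (z_varying @ rho.T) + z_invariant

# Hypothetical 4-d latent: the first pair is selected by the mask.
mask = binary_mask(np.array([3.0, 3.0, -3.0, -3.0]))
z = np.array([1.0, 0.0, 5.0, 7.0])
z_rotated = act_on_latent(z, np.pi / 2, mask)
```

With a mask that selects whole coordinate pairs, as in this example, applying two rotations in sequence matches applying their sum, so the masked map still behaves as a group action on the selected subspace.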
Findings
Successfully learned disentangled latent factors for group actions across diverse datasets
Improved controllability of image transformations in latent space
Enhanced downstream classification performance using learned representations
Abstract
Modeling group actions on latent representations enables controllable transformations of high-dimensional image data. Prior works applying group-theoretic priors or modeling transformations typically operate in the high-dimensional data space, where group actions apply uniformly across the entire input, making it difficult to disentangle the subspace that varies under transformations. While latent-space methods offer greater flexibility, they still require manual partitioning of latent variables into equivariant and invariant subspaces, limiting the ability to robustly learn and operate group actions within the representation space. To address this, we introduce a novel end-to-end framework that for the first time learns group actions on latent image manifolds, automatically discovering transformation-relevant structures without manual intervention. Our method uses learnable binary…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Domain Adaptation and Few-Shot Learning
