Invariance-based Multi-Clustering of Latent Space Embeddings for Equivariant Learning
Chandrajit Bajaj, Avik Roy, Haoran Zhang

TL;DR
This paper introduces a novel method for disentangling invariant and equivariant features in the latent space of VAEs using Lie group manifolds and mixture models, improving clustering and recognition performance.
Contribution
It proposes a new deep, group-invariant learning approach with a modified ELBO for better unsupervised variational clustering of invariant and equivariant features.
Findings
Effective disentanglement of invariant and equivariant representations.
Significant improvements in learning rate and recognition accuracy.
Superior image reconstruction compared to existing models.
Abstract
Variational Autoencoders (VAEs) have been shown to be remarkably effective in recovering model latent spaces for several computer vision tasks. However, currently trained VAEs, for a number of reasons, seem to fall short in learning invariant and equivariant clusters in latent space. Our work focuses on providing solutions to this problem and presents an approach to disentangle equivariance feature maps in a Lie group manifold by enforcing deep, group-invariant learning. By simultaneously implementing a novel separation of semantic and equivariant variables of the latent space representation, we formulate a modified Evidence Lower BOund (ELBO) that uses a mixture-model probability density, such as a Gaussian mixture, for invariant cluster embeddings, which allows superior unsupervised variational clustering. Our experiments show that this model effectively learns to disentangle the invariant and equivariant…
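The mixture-model ELBO mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a diagonal Gaussian posterior over the invariant block of the latent vector and a diagonal Gaussian mixture prior, and it leaves out the equivariant (Lie group) term entirely, since the abstract does not specify it. All function and parameter names (`gmm_log_prior`, `responsibilities`, `elbo_invariant_sample`, `z_inv`) are hypothetical.

```python
import math


def log_normal_diag(z, mean, var):
    """Log density of a diagonal Gaussian N(mean, diag(var)) evaluated at z."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (zi - m) ** 2 / v)
        for zi, m, v in zip(z, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def component_log_joint(z_inv, weights, means, variances):
    """Per-cluster terms log(pi_k) + log N(z_inv; mu_k, diag(var_k))."""
    return [
        math.log(w) + log_normal_diag(z_inv, m, v)
        for w, m, v in zip(weights, means, variances)
    ]


def gmm_log_prior(z_inv, weights, means, variances):
    """Log density of the Gaussian mixture prior at z_inv."""
    return logsumexp(component_log_joint(z_inv, weights, means, variances))


def responsibilities(z_inv, weights, means, variances):
    """Posterior cluster probabilities p(k | z_inv) under the mixture prior."""
    terms = component_log_joint(z_inv, weights, means, variances)
    norm = logsumexp(terms)
    return [math.exp(t - norm) for t in terms]


def elbo_invariant_sample(log_likelihood, z_inv, q_mean, q_var,
                          weights, means, variances):
    """Single-sample ELBO estimate for the invariant block only:
    log p(x | z) + log p(z_inv) - log q(z_inv | x).
    The reconstruction term log_likelihood is supplied by the caller."""
    log_prior = gmm_log_prior(z_inv, weights, means, variances)
    log_q = log_normal_diag(z_inv, q_mean, q_var)
    return log_likelihood + log_prior - log_q
```

In this sketch, `responsibilities` gives a soft cluster assignment for a sampled invariant code, which is how a mixture prior can yield unsupervised clustering as a by-product of training. A full model would add a separate term for the equivariant variables.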
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition · AI in cancer detection
