Unsupervised Representation Learning - an Invariant Risk Minimization Perspective
Yotam Norman, Ron Meir

TL;DR
This paper introduces an unsupervised framework for invariant risk minimization that learns robust representations from unlabeled data by aligning feature distributions, using methods like PICA and VIAE, and demonstrates effectiveness on synthetic and real datasets.
Contribution
It extends IRM to unlabeled data settings by redefining invariance through feature distribution alignment and introduces two novel methods, PICA and VIAE.
Findings
Effective in capturing invariant structures
Generalizes across environments without labels
Works on synthetic and real datasets
Abstract
We propose a novel unsupervised framework for Invariant Risk Minimization (IRM), extending the concept of invariance to settings where labels are unavailable. Traditional IRM methods rely on labeled data to learn representations that are robust to distributional shifts across environments. In contrast, our approach redefines invariance through feature distribution alignment, enabling robust representation learning from unlabeled data. We introduce two methods within this framework: Principal Invariant Component Analysis (PICA), a linear method that extracts invariant directions under Gaussian assumptions, and Variational Invariant Autoencoder (VIAE), a deep generative model that separates environment-invariant and environment-dependent latent factors. Our approach is based on a novel "unsupervised" structural causal model and supports environment-conditioned sample-generation…
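The abstract describes PICA only as a linear method that extracts invariant directions under Gaussian assumptions. As a rough illustration of what such directions could look like, and not the authors' algorithm, the sketch below assumes two zero-mean Gaussian environments and treats the invariant directions as the (near-)null space of the difference of their covariance matrices. The helper `invariant_directions`, the tolerance values, and the synthetic covariances are all assumptions made for this example.

```python
import numpy as np


def invariant_directions(cov_a, cov_b, tol=1e-8):
    """Orthonormal basis of directions w with (cov_a - cov_b) @ w ~= 0.

    Hypothetical helper, not the paper's PICA algorithm: for zero-mean
    Gaussians, projecting onto such directions gives the same covariance,
    hence the same distribution, in both environments.
    """
    diff = cov_a - cov_b                       # symmetric matrix
    eigvals, eigvecs = np.linalg.eigh(diff)    # real eigenpairs
    keep = np.abs(eigvals) <= tol              # near-zero eigenvalues
    return eigvecs[:, keep]


rng = np.random.default_rng(0)

# Two synthetic environments sharing an eigenbasis Q; variances agree on
# the first two axes and differ on the last two (all values are made up).
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
cov_a = Q @ np.diag([1.0, 2.0, 3.0, 0.5]) @ Q.T
cov_b = Q @ np.diag([1.0, 2.0, 0.7, 4.0]) @ Q.T

# Population covariances: the kernel is exactly two-dimensional.
W = invariant_directions(cov_a, cov_b)
proj_a = W.T @ cov_a @ W
proj_b = W.T @ cov_b @ W

# Empirical covariances: sampling noise generically removes the exact
# kernel, so a looser tolerance is needed to recover the directions.
n = 20_000
X_a = rng.multivariate_normal(np.zeros(4), cov_a, size=n)
X_b = rng.multivariate_normal(np.zeros(4), cov_b, size=n)
emp_a = np.cov(X_a, rowvar=False)
emp_b = np.cov(X_b, rowvar=False)
W_strict = invariant_directions(emp_a, emp_b)           # tol=1e-8
W_loose = invariant_directions(emp_a, emp_b, tol=0.3)   # ad hoc threshold

print(W.shape, W_strict.shape, W_loose.shape)
```

With the population covariances, the projected covariances `proj_a` and `proj_b` coincide. With sampled data, the strict tolerance typically finds no exactly invariant direction, so some thresholding is needed. The value 0.3 is an ad hoc choice for this synthetic setup and not a recommendation.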
Peer Reviews
Decision: ICLR 2026 Poster
Overall, I liked the topic of the paper, the discussion presented, and the developed methods. I think there are some original contributions that are presented clearly, and the work could attract interest from readers working on these topics.
There are some apparent weaknesses that I think the authors should take into account when revising the paper:
1. The motivation for the problem is not entirely clear. "Risk minimization" is a term mostly used in the context of a prediction problem, and I think the unsupervised setting is inherently different, hence a more suitable name for the work or solution might be something like an Invariant autoencoder/Environment-Invariant autoencoder etc. Now given this framing, it is not entirely clear
- The problem formulation (extending IRM to unlabeled multi-environment data) appears to be novel.
- The two-method approach (linear PICA + nonlinear VIAE) provides complementary perspectives with clear mathematical exposition.
- The fairness application demonstrates a natural use case where environment-invariant features correspond to removing sensitive attributes.
- One of the core weaknesses is that the objective is conceptually ill-defined. The paper redefines invariance as matching the marginals of the representations across environments but does not justify how this invariance serves IRM's goal of robust prediction. For instance, the learnt invariant features could be useless for any feasible downstream tasks.
- Connecting to the previous point, there is no identifiability analysis of the learned representation; the model could learn arbitrary rotati
- The notion of unsupervised invariant algorithms is compelling.
- The VIAE approach seems promising (as illustrated by the environment transfer algorithm).
- Some notation is confusing (see below).
- PICA is very constrained because the difference of two empirical covariance matrices $\Sigma_1-\Sigma_2$ is likely to have a trivial kernel. VIAE is far more satisfactory.
- Experiments in Section 4.2.1 are insufficient. For instance, you could construct a CMNIST problem with labels that only depend on the shape of the digit, and a RevCMNIST problem using the same patterns but with labels that only depend on the color. The invariant features in the sense
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning
