A Causal Ordering Prior for Unsupervised Representation Learning
Avinash Kori, Pedro Sanchez, Konstantinos Vilouras, Ben Glocker, Sotirios A. Tsaftaris

TL;DR
This paper introduces an unsupervised representation learning method that models causal relationships among latent variables, using a causal ordering prior inspired by functional causal models and a Hessian-based loss.
Contribution
It proposes a fully unsupervised approach to causal representation learning that relaxes independence assumptions and enforces causal order in latent space without auxiliary data.
Findings
Demonstrates causal ordering in latent space using a Hessian-based loss
Achieves identifiable causal representations without supervision
Extends variational inference to causal latent models
Abstract
Unsupervised representation learning with variational inference relies heavily on independence assumptions over latent variables. Causal representation learning (CRL), however, argues that factors of variation in a dataset are, in fact, causally related. Allowing latent variables to be correlated, as a consequence of causal relationships, is more realistic and generalisable. So far, provably identifiable methods rely on auxiliary information, weak labels, and interventional or even counterfactual data. Inspired by causal discovery with functional causal models, we propose a fully unsupervised representation learning method that considers a data generation process with a latent additive noise model (ANM). We encourage the latent space to follow a causal ordering via a loss function based on the Hessian of the latent distribution.
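The link between a Hessian and a causal ordering can be illustrated on a toy two-variable ANM. The sketch below is not the paper's loss; it only shows the leaf criterion from score-based causal discovery, under the assumption of Gaussian noise: for a leaf variable, the corresponding diagonal entry of the Hessian of the log-density is constant across samples, so its variance is zero. The mechanism `f`, the noise scale 0.5, and all function names are hypothetical choices made for illustration, and the log-density is written analytically rather than learned.

```python
import math
import random
from statistics import pvariance


def f(z1):
    # Hypothetical nonlinear causal mechanism z1 -> z2 (an assumption).
    return math.sin(2.0 * z1)


def log_density(z):
    # Unnormalised log p(z1, z2) for the toy ANM:
    #   z1 ~ N(0, 1),  z2 = f(z1) + n2,  n2 ~ N(0, 0.5^2)
    z1, z2 = z
    return -0.5 * z1 ** 2 - 0.5 * ((z2 - f(z1)) / 0.5) ** 2


def hessian_diagonal(log_p, z, eps=1e-3):
    # Central finite differences for d^2 log p / dz_i^2 at the point z.
    base = log_p(z)
    diag = []
    for i in range(len(z)):
        up = list(z)
        up[i] += eps
        down = list(z)
        down[i] -= eps
        diag.append((log_p(up) - 2.0 * base + log_p(down)) / eps ** 2)
    return diag


def find_leaf(log_p, samples):
    # The variable whose Hessian diagonal varies least across samples
    # is taken to be the leaf of the causal graph.
    rows = [hessian_diagonal(log_p, z) for z in samples]
    variances = [pvariance(col) for col in zip(*rows)]
    leaf = min(range(len(variances)), key=variances.__getitem__)
    return leaf, variances


def sample(n, seed=0):
    # Draw n points from the toy ANM defined above.
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = f(z1) + rng.gauss(0.0, 0.5)
        points.append([z1, z2])
    return points


samples = sample(500)
leaf, variances = find_leaf(log_density, samples)
print("leaf index:", leaf)
print("variance of Hessian diagonal per variable:", variances)
```

Here index 1 (`z2`) is identified as the leaf because the log-density is exactly quadratic in `z2`, whereas the entry for `z1` depends on both `z1` and `z2`. Removing the detected leaf and repeating the test on the remaining variables would yield a full ordering; turning the variance into a differentiable penalty on a learned latent density is one plausible reading of "a loss function based on the Hessian", but that step is an assumption here.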
Peer Reviews
Decision·Submitted to ICLR 2024
The paper attempts to provide a new method for causal representation learning, combining different results from the causal discovery and representation learning literature in a novel way.
1. Theorem 2 is, on its own and in its current formulation, incomplete. Supposedly, it exploits the results in (Kivva et al., 2022). However, the statement says, _"invertible mixing functions"_. For any invertible mixing functions, without further assumptions (none are stated in Thm. 2), it is possible to build counterexamples to identifiability in the i.i.d. setting based on the Darmois construction [1]. It also appears non-rigorous to state that the mixing functions are not _"identically distr
- The paper studies an important problem, i.e., how to learn latent variables with causal relations.
- It nicely combines ideas from different fields: Identifiability of latent variable model with piecewise linear mixing, Gaussian mixture models, score based causal discovery.
- The main paper is generally easy to follow
- Experimental results look generally promising.
- There do not seem to be any completely novel ideas in the paper (score based causal discovery, gmm latent prior, identifiability without auxiliary information). (this is a minor point)
- For the identifiability part, there are some questions regarding the combination of assumptions (see questions).
- The proofs in Appendix A could be much clearer (see questions), and, more generally, all math parts should be checked carefully (being slightly imprecise can make it very difficult to follow the d
- The research direction is important because causal representation learning is a promising extension of neural methods that aims to leverage additional causal information for robustness and generalization.
- The proposed method coVAE shows improved MCC and R^2 metrics over prior experimental baselines, i.e. improved recovery of meaningful latents.
- The abstract claims that all provably identifiable methods rely on additional information; however, the main work (Kivva et al.) that the authors cite shows that we do not need additional information, so this claim should be revised.
- It's hard to understand how Assumptions 2, 3, 5 together work. They seem unrelated assumptions with different motivations -- Assumption 2, 3 on latent SCMs is more directly related to the paper since it's inherently causal; however, Assumption 5 on GMMs i
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Domain Adaptation and Few-Shot Learning · Machine Learning in Healthcare
Methods: Variational Inference
