A theory of independent mechanisms for extrapolation in generative models
Michel Besserve, Rémy Sun, Dominik Janzing, Bernhard Schölkopf

TL;DR
This paper introduces a theoretical framework for enhancing the extrapolation abilities of generative models by enforcing independence of mechanisms, addressing issues of unobserved causal structures and overparameterization.
Contribution
It proposes a weaker form of identifiability based on mechanism independence and demonstrates how explicit enforcement improves extrapolation in generative models.
Findings
On toy examples, classical stochastic gradient descent can hinder a model's extrapolation capabilities.
Enforcing independence of mechanisms improves extrapolation.
Experiments show practical benefits on real-world data.
Abstract
Generative models can be trained to emulate complex empirical data, but are they useful to make predictions in the context of previously unobserved environments? An intuitive idea to promote such extrapolation capabilities is to have the architecture of such a model reflect a causal graph of the true data generating process, such that one can intervene on each node independently of the others. However, the nodes of this graph are usually unobserved, leading to overparameterization and lack of identifiability of the causal structure. We develop a theoretical framework to address this challenging situation by defining a weaker form of identifiability, based on the principle of independence of mechanisms. We demonstrate on toy examples that classical stochastic gradient descent can hinder the model's extrapolation capabilities, suggesting independence of mechanisms should be enforced…
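The identifiability problem described in the abstract can be illustrated with a toy sketch. The code below is not the paper's model or method; it assumes a hypothetical two-stage linear generator (z -> h = W1 z -> x = W2 h) whose hidden layer h is unobserved, and all names (W1, W2, A, generate, scale) are illustrative. It shows that two parameterizations can produce identical samples while disagreeing once the hidden layer is intervened on.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-stage linear generator: z -> h = W1 @ z -> x = W2 @ h.
W1 = rng.normal(size=(3, 3))
W2 = rng.normal(size=(4, 3))

# An invertible mixing of the hidden units (assumed for illustration).
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

# Second parameterization: same composite map, since W2 A^{-1} A W1 = W2 W1.
W1_alt = A @ W1
W2_alt = W2 @ np.linalg.inv(A)


def generate(W1, W2, z, scale=None):
    """Generate samples; optionally rescale hidden units (an intervention)."""
    h = W1 @ z
    if scale is not None:
        h = scale[:, None] * h
    return W2 @ h


z = rng.normal(size=(3, 5))  # five latent samples

# Without intervention, both parameterizations give the same outputs.
same_observational = np.allclose(generate(W1, W2, z),
                                 generate(W1_alt, W2_alt, z))

# Intervening on the first hidden unit gives different outputs.
scale = np.array([2.0, 1.0, 1.0])
same_interventional = np.allclose(generate(W1, W2, z, scale),
                                  generate(W1_alt, W2_alt, z, scale))

print(same_observational, same_interventional)
```

Under these assumptions, fitting the observed data alone cannot distinguish the two parameterizations, yet they extrapolate differently under the same intervention. This is the kind of ambiguity that an additional constraint on the mechanisms is meant to reduce.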
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification
