Learning Robust Models Using The Principle of Independent Causal Mechanisms

Jens Müller; Robert Schmier; Lynton Ardizzone; Carsten Rother; Ullrich Köthe

arXiv:2010.07167·cs.LG·February 9, 2021

TL;DR

This paper introduces a gradient-based learning framework inspired by the principle of independent causal mechanisms, enabling neural networks to focus on invariant relations and improve robustness under distribution shifts.

Contribution

The paper proposes a training method derived from the principle of independent causal mechanisms (ICM) that makes models more robust to distribution shift and, under certain conditions, provably recovers the true causal mechanisms.

Findings

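1. Networks trained in this framework focus on relations that remain invariant across environments and ignore unstable ones.

2. Under certain conditions, the recovered stable relations provably correspond to the true causal mechanisms.

3. In both regression and classification, the resulting models generalize to unseen scenarios where traditionally trained models fail.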

Abstract

Standard supervised learning breaks down under data distribution shift. However, the principle of independent causal mechanisms (ICM, Peters et al. (2017)) can turn this weakness into an opportunity: one can take advantage of distribution shift between different environments during training in order to obtain more robust models. We propose a new gradient-based learning framework whose objective function is derived from the ICM principle. We show theoretically and experimentally that neural networks trained in this framework focus on relations remaining invariant across environments and ignore unstable ones. Moreover, we prove that the recovered stable relations correspond to the true causal mechanisms under certain conditions. In both regression and classification, the resulting models generalize well to unseen scenarios where traditionally trained models fail.
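
The idea of exploiting several training environments to tell stable relations from unstable ones can be illustrated with a toy example. The sketch below is not the paper's gradient-based objective: it is a simplified stand-in that fits a pooled least-squares model on each feature subset and measures how much the residual distribution differs between two environments. The data-generating process, the coefficients, and the names `make_env` and `residual_gap` are all assumptions made for illustration.

```python
import itertools
import numpy as np


def make_env(rng, n, a):
    """Toy environment (assumed setup): x1 causes y, x2 is an effect of y.

    The strength `a` of the y -> x2 link changes between environments,
    so the relation between x2 and y is unstable.
    """
    x1 = rng.normal(size=n)
    y = 2.0 * x1 + rng.normal(size=n)   # invariant mechanism
    x2 = a * y + rng.normal(size=n)     # environment-dependent mechanism
    return np.column_stack([x1, x2]), y


def residual_gap(envs, cols):
    """Fit one pooled linear model on `cols`, then compare residual
    mean and standard deviation between the two environments."""
    cols = list(cols)
    X = np.vstack([Xe[:, cols] for Xe, _ in envs])
    y = np.concatenate([ye for _, ye in envs])
    A = np.column_stack([X, np.ones(len(y))])      # add intercept
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    stats = []
    for Xe, ye in envs:
        Ae = np.column_stack([Xe[:, cols], np.ones(len(ye))])
        r = ye - Ae @ w
        stats.append((r.mean(), r.std()))
    (m0, s0), (m1, s1) = stats
    return abs(m0 - m1) + abs(s0 - s1)


rng = np.random.default_rng(0)
envs = [make_env(rng, 20_000, a=2.0), make_env(rng, 20_000, a=0.5)]

# Score every non-empty feature subset by its cross-environment residual gap.
subsets = [c for k in (1, 2) for c in itertools.combinations(range(2), k)]
gaps = {c: residual_gap(envs, c) for c in subsets}
best = min(gaps, key=gaps.get)

for c in subsets:
    print(c, round(gaps[c], 3))
print("most invariant subset:", best)
```

In this toy setup only the model that uses the causal feature `x1` alone leaves residuals that look the same in both environments; any model that uses `x2` produces residuals whose spread depends on the environment. A gradient-based method would replace the exhaustive subset search with a differentiable penalty, which this sketch does not attempt.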
