Learning Modular Structures That Generalize Out-of-Distribution

Arjun Ashok; Chaitanya Devaguptapu; Vineeth Balasubramanian

arXiv:2208.03753·cs.LG·August 9, 2022

TL;DR

This paper introduces a modular training approach that enhances out-of-distribution generalization by focusing on features consistently reused across multiple domains, using neuron-level regularizers and a probabilistic binary mask.

Contribution

The paper proposes a method that combines two complementary neuron-level regularizers with a probabilistic, differentiable binary mask to extract a modular sub-network with improved O.O.D. generalization.

Findings

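01. The extracted modular sub-network achieves better O.O.D. performance than the original network.

02. Neuron-level regularizers encourage the network to keep features that are reused across training domains.

03. Preliminary results on two benchmark datasets support the method's effectiveness.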

Abstract

Out-of-distribution (O.O.D.) generalization remains a key challenge for real-world machine learning systems. We describe a method for O.O.D. generalization that, through training, encourages models to preserve only those features in the network that are well reused across multiple training domains. Our method combines two complementary neuron-level regularizers with a probabilistic differentiable binary mask over the network, to extract a modular sub-network that achieves better O.O.D. performance than the original network. Preliminary evaluation on two benchmark datasets corroborates the promise of our method.
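To make the idea of a probabilistic binary mask over a network more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify how the mask is parameterized or what the two regularizers look like, so this sketch assumes a binary-concrete (relaxed Bernoulli) mask with one learnable logit per neuron, and it omits the regularizers entirely. All function names (`sample_relaxed_mask`, `hard_mask`, `apply_mask`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sample_relaxed_mask(logits, temperature=0.5, rng=random):
    """Draw one relaxed Bernoulli sample per neuron (assumed parameterization).

    Each value lies between 0 and 1 and depends smoothly on its logit,
    which is what lets an autodiff framework train the logits by gradient descent.
    """
    mask = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so the logs stay finite.
        u = min(max(rng.random(), 1e-6), 1.0 - 1e-6)
        logistic_noise = math.log(u) - math.log(1.0 - u)
        mask.append(sigmoid((logit + logistic_noise) / temperature))
    return mask


def hard_mask(logits):
    """Deterministic binary mask: keep a neuron iff its keep-probability exceeds 0.5."""
    return [1 if sigmoid(logit) > 0.5 else 0 for logit in logits]


def apply_mask(activations, mask):
    """Element-wise gating of a layer's activations by the mask."""
    return [a * m for a, m in zip(activations, mask)]


# Hypothetical per-neuron logits for a three-neuron layer.
logits = [3.0, -3.0, 0.5]
activations = [2.0, 5.0, -1.0]

soft = sample_relaxed_mask(logits, rng=random.Random(0))  # used during training
kept = hard_mask(logits)                                  # used to extract the sub-network
pruned = apply_mask(activations, kept)
```

In this sketch the relaxed sample would gate activations during training, while the thresholded mask selects the sub-network afterwards; the second neuron, whose logit is negative, is dropped.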
