Learning where to learn: Gradient sparsity in meta and continual learning

Johannes von Oswald; Dominic Zhao; Seijin Kobayashi; Simon Schug; Massimo Caccia; Nicolas Zucchet; João Sacramento

arXiv:2110.14402·cs.LG·October 28, 2021·24 cites

TL;DR

This paper introduces a method where meta-learning determines which neural network weights to update, leading to patterned sparsity that improves generalization and reduces interference in few-shot and continual learning tasks.

Contribution

It demonstrates that learning where to learn via sparse gradient updates enhances meta-learning performance and reveals problem-specific sparsity patterns.

Findings

01. Patterned sparsity emerges when the learning algorithm decides which weights to change.

02. Sparse learning improves generalization in few-shot tasks.

03. Sparse learning also emerges when learning rates are meta-learned.

Abstract

Finding neural network weights that generalize well from small datasets is difficult. A promising approach is to learn a weight initialization such that a small number of weight changes results in low generalization error. We show that this form of meta-learning can be improved by letting the learning algorithm decide which weights to change, i.e., by learning where to learn. We find that patterned sparsity emerges from this process, with the pattern of sparsity varying on a problem-by-problem basis. This selective sparsity results in better generalization and less interference in a range of few-shot and continual learning problems. Moreover, we find that sparse learning also emerges in a more expressive model where learning rates are meta-learned. Our results shed light on an ongoing debate on whether meta-learning can discover adaptable features and suggest that learning by sparse…
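The idea of "learning where to learn" can be illustrated with a small sketch of a masked inner-loop update: starting from a shared initialization, only the weights selected by a mask receive gradient steps. This is a toy illustration, not the authors' implementation. The linear model, the function names `masked_inner_step` and `mse_grad`, the thresholding of scores at zero, and all numeric values are assumptions made here for illustration, and the outer loop that would meta-learn the initialization and the mask scores is omitted.

```python
# Toy sketch of a sparse (masked) inner-loop adaptation step.
# All names and values are hypothetical; the outer meta-learning loop is omitted.

def masked_inner_step(weights, grads, mask_scores, lr):
    """Apply one gradient step only to weights whose mask score is >= 0."""
    # Assumption: a binary mask is obtained by thresholding real-valued scores at zero.
    mask = [1.0 if s >= 0.0 else 0.0 for s in mask_scores]
    return [w - lr * m * g for w, m, g in zip(weights, mask, grads)]


def mse_grad(weights, xs, ys):
    """Gradient of the mean squared error of a linear model y = w . x."""
    n = len(xs)
    grads = [0.0] * len(weights)
    for x, y in zip(xs, ys):
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grads[i] += 2.0 * err * xi / n
    return grads


# Hypothetical meta-learned quantities: a shared initialization and mask scores.
init = [0.5, -0.2, 0.1]
mask_scores = [1.3, -0.7, 0.4]  # the second weight is frozen during adaptation

# A tiny few-shot task: two examples with three features each.
xs = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
ys = [1.0, 0.0]

# Adapt to the task with a few masked gradient steps.
adapted = list(init)
for _ in range(3):
    adapted = masked_inner_step(adapted, mse_grad(adapted, xs, ys), mask_scores, lr=0.1)
```

After adaptation, the weight with a negative mask score keeps its initial value while the others move toward the task solution. In a full meta-learning setup, the initialization and the mask scores would both be trained across many tasks so that the selected weights are the ones worth changing.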

Code & Models

Repositories

johswald/learning_where_to_learn (PyTorch, official)

Videos

Learning where to learn: Gradient sparsity in meta and continual learning · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition · Model Reduction and Neural Networks