Conditional Computation for Continual Learning

Min Lin; Jie Fu; Yoshua Bengio

arXiv:1906.06635 · cs.LG · June 18, 2019 · 6 cites

TL;DR

This paper explores how conditional computation and parameter sharing strategies can mitigate catastrophic forgetting in neural networks, proposing a method called conditional rehearsal that selectively revisits interfered examples during continual learning.

Contribution

It introduces a parameter sharing analysis framework and a novel rehearsal method to prevent forgetting in online continual learning scenarios.

Findings

01 Conditional rehearsal effectively reduces forgetting in non-stationary environments.

02 Partial parameter sharing balances learning flexibility and stability.

03 The method outperforms traditional approaches in online continual learning tasks.

Abstract

Catastrophic forgetting in connectionist neural networks is caused by the global sharing of parameters among all training examples. In this study, we analyze parameter sharing under the conditional computation framework, where the parameters of a neural network are conditioned on each input example. At one extreme, if each input example uses a disjoint set of parameters, there is no parameter sharing and thus no catastrophic forgetting. At the other extreme, if the parameters are the same for every example, the model reduces to a conventional neural network. We then introduce a clipped version of maxout networks that lies in between, i.e., parameters are shared partially among examples. Based on the parameter sharing analysis, we can locate a limited set of examples that are interfered with when learning a new example. We propose to perform rehearsal on this set to prevent forgetting, which is…
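The idea of locating interfered examples through parameter sharing can be illustrated with a minimal sketch. It is not the paper's implementation: it uses a plain maxout layer (the clipping described in the abstract is not modeled), all function names are hypothetical, and the rule "two examples interfere if they select the same piece in at least `min_shared` units" is an assumption made for illustration. It relies only on the standard property of maxout units that the gradient reaches just the piece that wins the max, so two examples share parameters in a unit only when they pick the same piece.

```python
import random


def make_maxout_layer(n_in, n_units, n_pieces, seed=0):
    """Random layer: weights[u][p] is the weight vector of piece p in unit u."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(n_in)]
             for _ in range(n_pieces)]
            for _ in range(n_units)]


def active_pieces(weights, x):
    """For each maxout unit, return the index of the piece that wins the max."""
    winners = []
    for unit in weights:
        scores = [sum(w * v for w, v in zip(piece, x)) for piece in unit]
        winners.append(max(range(len(scores)), key=scores.__getitem__))
    return winners


def shared_units(winners_a, winners_b):
    """Units in which two examples select the same piece (shared parameters)."""
    return [u for u, (a, b) in enumerate(zip(winners_a, winners_b)) if a == b]


def interfered_examples(weights, memory, x_new, min_shared=1):
    """Indices of stored examples sharing >= min_shared pieces with x_new.

    Assumed criterion for this sketch: these are the candidates to rehearse
    when the model is updated on x_new.
    """
    new_winners = active_pieces(weights, x_new)
    return [i for i, x_old in enumerate(memory)
            if len(shared_units(active_pieces(weights, x_old),
                                new_winners)) >= min_shared]


if __name__ == "__main__":
    # Hand-built layer: unit 0 splits on the sign of x[0], unit 1 on x[1].
    layer = [[[1.0, 0.0], [-1.0, 0.0]],
             [[0.0, 1.0], [0.0, -1.0]]]
    memory = [[-1.0, 1.0], [-1.0, -1.0]]
    print(interfered_examples(layer, memory, [1.0, 1.0]))
```

In this toy layer, the new example `[1.0, 1.0]` shares a piece only with the first stored example, so only that one would be selected for rehearsal. Raising `min_shared` shrinks the rehearsal set, which mirrors the trade-off between full sharing and disjoint parameters described in the abstract.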

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms · Machine Learning and ELM

Methods: Maxout