Sparse Meta Networks for Sequential Adaptation and its Application to Adaptive Language Modelling
Tsendsuren Munkhdalai

TL;DR
This paper introduces Sparse Meta Networks, a meta-learning approach that enables deep neural networks to perform fast online sequential adaptation, inspired by human memory systems, with applications demonstrated in adaptive language modeling.
Contribution
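The paper proposes Sparse Meta Networks, which augment a deep neural network with a layer-specific fast-weight memory. The fast-weights are generated sparsely at each time step and accumulated over time, giving the network an inductive bias suited to online continual adaptation.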
Findings
Strong performance in online reinforcement learning scenarios.
Effective large-scale adaptive language modeling.
Sparse fast-weights facilitate continual adaptation.
Abstract
Training a deep neural network requires a large amount of single-task data and involves a long, time-consuming optimization phase. This is not scalable to complex, realistic environments with new, unexpected changes. Humans can perform fast incremental learning on the fly, and memory systems in the brain play a critical role. We introduce Sparse Meta Networks -- a meta-learning approach to learn online sequential adaptation algorithms for deep neural networks, by using deep neural networks. We augment a deep neural network with a layer-specific fast-weight memory. The fast-weights are generated sparsely at each time step and accumulated incrementally through time, providing a useful inductive bias for online continual adaptation. We demonstrate strong performance on a variety of sequential adaptation scenarios, from a simple online reinforcement learning to a large scale adaptive language…
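The mechanism the abstract describes (a per-layer fast-weight memory that receives a sparse update at every time step and accumulates those updates over time) can be illustrated with a small NumPy sketch. This is not the paper's method: in the paper the update is produced by a meta-learned network, whereas here a plain delta-rule outer product stands in for the generator, and top-k magnitude selection is an assumed way of making the update sparse. The names FastWeightLayer, top_k_sparsify, k, and lr are hypothetical.

```python
import numpy as np


def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries of `update` and zero the rest."""
    if k <= 0:
        return np.zeros_like(update)
    flat = np.abs(update).ravel()
    if k >= flat.size:
        return update.copy()
    # Indices of the k largest magnitudes (their relative order is irrelevant).
    keep = np.argpartition(flat, -k)[-k:]
    mask = np.zeros(flat.size, dtype=bool)
    mask[keep] = True
    return np.where(mask.reshape(update.shape), update, 0.0)


class FastWeightLayer:
    """Linear layer with fixed slow weights plus an accumulating fast-weight memory."""

    def __init__(self, n_in, n_out, k, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.slow = rng.normal(scale=0.1, size=(n_out, n_in))  # not changed online
        self.fast = np.zeros((n_out, n_in))                    # layer-specific memory
        self.k = k    # number of fast-weight entries written per step (assumption)
        self.lr = lr  # step size of the stand-in update rule (assumption)

    def forward(self, x):
        # Effective weights are the sum of slow and fast weights.
        return (self.slow + self.fast) @ x

    def adapt(self, x, target):
        # Stand-in generator: a delta-rule outer product. The paper instead
        # learns the generator with a meta network.
        error = target - self.forward(x)
        dense_update = self.lr * np.outer(error, x)
        sparse_update = top_k_sparsify(dense_update, self.k)
        # Incremental accumulation through time.
        self.fast += sparse_update
        return sparse_update


layer = FastWeightLayer(n_in=4, n_out=3, k=2)
x = np.full(4, 0.5)                    # unit-norm input
target = np.array([1.0, -1.0, 0.5])

error_before = np.linalg.norm(target - layer.forward(x))
update = layer.adapt(x, target)
error_after = np.linalg.norm(target - layer.forward(x))
```

Each call to adapt writes at most k entries of the memory, so successive steps touch only small parts of the fast weights rather than overwriting them densely. With a unit-norm input and lr below 2, this stand-in rule cannot increase the error on the sample it has just seen.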
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Speech Recognition and Synthesis
