Improved Schemes for Episodic Memory-based Lifelong Learning
Yunhui Guo, Mingrui Liu, Tianbao Yang, Tajana Rosing

TL;DR
This paper introduces two improved episodic memory-based lifelong learning schemes, MEGA-I and MEGA-II. Both balance old and new tasks better than prior methods such as GEM and A-GEM, and they reduce error by up to 18% on standard benchmarks.
Contribution
The paper provides a unified optimization perspective on episodic memory approaches and proposes novel schemes that outperform existing methods on standard benchmarks.
Findings
MEGA-I and MEGA-II outperform GEM and A-GEM in lifelong learning tasks.
The new schemes reduce error by up to 18% relative to GEM and A-GEM on standard lifelong learning benchmarks.
They balance old and new task learning by combining the current-task gradient with the gradient computed on the episodic memory.
Abstract
Current deep neural networks can achieve remarkable performance on a single task. However, when a deep neural network is continually trained on a sequence of tasks, it seems to gradually forget previously learned knowledge. This phenomenon is referred to as catastrophic forgetting and motivates the field called lifelong learning. Recently, episodic memory-based approaches such as GEM \cite{lopez2017gradient} and A-GEM \cite{chaudhry2018efficient} have shown remarkable performance. In this paper, we provide the first unified view of episodic memory-based approaches from an optimization perspective. This view leads to two improved schemes for episodic memory-based lifelong learning, called MEGA-I and MEGA-II. MEGA-I and MEGA-II modulate the balance between old tasks and the new task by integrating the current gradient with the gradient computed on the episodic memory.…
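The idea of integrating the current gradient with the episodic-memory gradient can be illustrated with a small sketch. This is not the paper's algorithm: the helper name `mix_gradients`, its parameters, and in particular the loss-ratio weighting are assumptions chosen for illustration, since the abstract above does not state how the two gradients are weighted.

```python
from typing import List, Sequence


def mix_gradients(
    grad_current: Sequence[float],
    grad_memory: Sequence[float],
    loss_current: float,
    loss_memory: float,
    eps: float = 1e-8,
) -> List[float]:
    """Hypothetical helper: weighted sum of two gradients.

    grad_current: gradient of the loss on the new task's batch.
    grad_memory:  gradient of the loss on a batch from episodic memory.

    Assumption (not taken from the abstract): the memory gradient is
    weighted by loss_memory / loss_current, so old tasks get more weight
    when they are doing poorly relative to the new task.
    """
    if len(grad_current) != len(grad_memory):
        raise ValueError("gradients must have the same length")

    if loss_current <= eps:
        # New task is essentially solved: follow the memory gradient only.
        w_current, w_memory = 0.0, 1.0
    else:
        w_current, w_memory = 1.0, loss_memory / loss_current

    return [
        w_current * gc + w_memory * gm
        for gc, gm in zip(grad_current, grad_memory)
    ]


# Example: the memory loss is half the current loss, so the memory
# gradient is scaled by 0.5 before being added.
mixed = mix_gradients([1.0, 0.0], [0.0, 2.0], loss_current=2.0, loss_memory=1.0)
print(mixed)  # [1.0, 1.0]
```

The mixed vector would then replace the plain current-task gradient in the optimizer step. Other weightings are possible; the sketch only shows the general shape of such a combination.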
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
