Bilevel Continual Learning

Ammar Shaker; Francesco Alesiani; Shujian Yu; Wenzhe Yin

arXiv:2011.01168 · cs.LG · November 3, 2020 · 1 citation

TL;DR

This paper introduces Bilevel Continual Learning (BiCL), a framework combining bilevel optimization and meta-learning to improve continual learning by reducing catastrophic forgetting in deep neural networks.

Contribution

It presents a novel bilevel optimization-based framework for continual learning that handles both discriminative and generative models in an online setting.

Findings

01. BiCL achieves competitive accuracy on current tasks.

02. BiCL reduces catastrophic forgetting effectively.

03. It extends continual learning to generative models.

Abstract

Continual learning (CL) studies the problem of learning a sequence of tasks, one at a time, such that learning each new task does not degrade performance on previously seen tasks while still exploiting previously learned features. This paper presents Bilevel Continual Learning (BiCL), a general framework for continual learning that fuses bilevel optimization with recent advances in meta-learning for deep neural networks. BiCL can train both deep discriminative and generative models under the conservative setting of online continual learning. Experimental results show that BiCL provides competitive accuracy on the current task while reducing the effect of catastrophic forgetting. This work is concurrent with [1]. We submitted it to AAAI 2020 and IJCAI 2020 and have now put it on arXiv for the record. Different from [1], we also…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition

Methods: Coresets