Class-incremental Learning via Deep Model Consolidation

Junting Zhang; Jie Zhang; Shalini Ghosh; Dawei Li; Serafettin Tasci; Larry Heck; Heming Zhang; C.-C. Jay Kuo

arXiv:1903.07864·cs.CV·January 17, 2020·29 cites

TL;DR

This paper introduces Deep Model Consolidation, a method for class-incremental learning that effectively combines models trained on old and new classes using unlabeled auxiliary data, avoiding catastrophic forgetting.

Contribution

It proposes a novel double distillation training approach that consolidates separate models without needing original training data, improving incremental learning performance.

Findings

01 Outperforms state-of-the-art methods on CIFAR-100 and CUB-200 image classification tasks.

02 Achieves superior results in object detection on PASCAL VOC 2007.

03 Effective even when original training data is unavailable.

Abstract

Deep neural networks (DNNs) often suffer from "catastrophic forgetting" during incremental learning (IL): an abrupt degradation of performance on the original set of classes when the training objective is adapted to a newly added set of classes. Existing IL approaches tend to produce a model that is biased towards either the old classes or the new classes unless they can draw on exemplars of the old data. To address this issue, we propose a class-incremental learning paradigm called Deep Model Consolidation (DMC), which works well even when the original training data is not available. The idea is to first train a separate model only for the new classes, and then combine the two individual models, trained on data of two distinct sets of classes (old classes and new classes), via a novel double distillation training objective. The two existing models are consolidated by exploiting publicly…
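
The consolidation step described in the abstract can be illustrated with a small sketch. The abstract is truncated before it gives the exact objective, so everything here is an assumption made for illustration: the function names are hypothetical, the teachers' logits are mean-centered before being concatenated, and the student is trained with a squared-error loss against those combined targets on a single unlabeled auxiliary sample.

```python
from typing import List, Sequence


def center(logits: Sequence[float]) -> List[float]:
    """Subtract the mean so the two teachers' logits share a common offset.

    Assumption: mean-centering is used here for illustration only.
    """
    mean = sum(logits) / len(logits)
    return [z - mean for z in logits]


def consolidation_targets(old_logits: Sequence[float],
                          new_logits: Sequence[float]) -> List[float]:
    """Concatenate centered old-class and new-class teacher logits."""
    return center(old_logits) + center(new_logits)


def double_distillation_loss(student_logits: Sequence[float],
                             old_logits: Sequence[float],
                             new_logits: Sequence[float]) -> float:
    """Mean squared error between student logits and combined teacher targets.

    The student covers all classes, so its output length must equal the
    number of old classes plus the number of new classes.
    """
    targets = consolidation_targets(old_logits, new_logits)
    if len(student_logits) != len(targets):
        raise ValueError("student must output one logit per old and new class")
    squared_errors = [(s - t) ** 2 for s, t in zip(student_logits, targets)]
    return sum(squared_errors) / len(targets)


# Hypothetical teacher outputs for one unlabeled auxiliary image:
# the old model knows 2 classes, the new model knows 3.
old_teacher = [2.0, 0.0]
new_teacher = [3.0, 1.0, 2.0]
untrained_student = [0.0, 0.0, 0.0, 0.0, 0.0]
loss = double_distillation_loss(untrained_student, old_teacher, new_teacher)
```

In this sketch no labels and no original training data are needed: the only supervision is the pair of teacher outputs on the auxiliary sample, which matches the setting the abstract describes.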

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications