Continual Learners are Incremental Model Generalizers

Jaehong Yoon; Sung Ju Hwang; Yue Cao

arXiv:2306.12026 · cs.LG · June 22, 2023 · 1 cites

TL;DR

This paper demonstrates that continual learning models serve as effective pre-trainers by gradually enhancing task-general representations, leading to improved transferability and a sustainable learning framework that bridges pre-training and fine-tuning.

Contribution

The paper introduces a new unsupervised continual learning framework based on masked modeling, along with a fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves task-generic features.

Findings

1. In both supervised and unsupervised CL, the transfer quality of the representation often improves gradually, without noticeable degradation in fine-tuning performance.

2. The proposed GLAD scheme preserves rich task-generic representations during fine-tuning.

3. Pre-trained CL models achieve competitive results and serve as strong starting points for downstream tasks.

Abstract

Motivated by the efficiency and rapid convergence of pre-trained models for solving downstream tasks, this paper extensively studies the impact of Continual Learning (CL) models as pre-trainers. In both supervised and unsupervised CL, we find that the transfer quality of the representation often increases gradually without noticeable degradation in fine-tuning performance. This is because CL models can learn improved task-general features when easily forgetting task-specific knowledge. Based on this observation, we suggest a new unsupervised CL framework with masked modeling, which aims to capture fluent task-generic representation during training. Furthermore, we propose a new fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves rich task-generic representation during solving downstream tasks. The model fine-tuned with GLAD achieves competitive performance and can…

Videos

Continual Learners are Incremental Model Generalizers · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications