Learning to Prompt for Continual Learning

Zifeng Wang; Zizhao Zhang; Chen-Yu Lee; Han Zhang; Ruoxi Sun; Xiaoqi Ren; Guolong Su; Vincent Perot; Jennifer Dy; Tomas Pfister

arXiv:2112.08654·cs.LG·March 23, 2022·5 cites

TL;DR

This paper introduces a novel continual learning approach called Learning to Prompt (L2P), which uses dynamic prompts to adapt a pre-trained model to new tasks without relying on task labels or rehearsal buffers.

Contribution

L2P is a new paradigm that trains a succinct memory of prompts to manage task-invariant and task-specific knowledge, outperforming prior methods without needing task identity at test time.

Findings

01. L2P outperforms state-of-the-art methods on image classification benchmarks.

02. L2P performs competitively against rehearsal-based methods without using a rehearsal buffer.

03. L2P is effective in task-agnostic continual learning scenarios.

Abstract

The mainstream paradigm behind continual learning has been to adapt the model parameters to non-stationary data distributions, where catastrophic forgetting is the central challenge. Typical methods rely on a rehearsal buffer or known task identity at test time to retrieve learned knowledge and address forgetting. In contrast, this work presents a new paradigm for continual learning that aims to train a more succinct memory system without accessing task identity at test time. Our method learns to dynamically prompt (L2P) a pre-trained model to learn tasks sequentially under different task transitions. In our proposed framework, prompts are small learnable parameters, which are maintained in a memory space. The objective is to optimize prompts to instruct the model prediction and explicitly manage task-invariant and task-specific knowledge while maintaining model plasticity. We conduct…
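To make the idea of "small learnable parameters maintained in a memory space" concrete, here is a minimal sketch of a prompt memory that is queried per input and prepended to a frozen model's token embeddings. The abstract as shown here does not describe how prompts are selected, so the key-to-query cosine matching, the class name `PromptPool`, and all parameter names and sizes are assumptions made for illustration only. Training of the prompts and keys is omitted.

```python
import numpy as np


class PromptPool:
    """Hypothetical prompt memory: each prompt is paired with a key vector."""

    def __init__(self, pool_size, prompt_len, dim, seed=0):
        rng = np.random.default_rng(seed)
        # In a real system these would be learnable; here they are random.
        self.keys = rng.normal(size=(pool_size, dim))
        self.prompts = rng.normal(size=(pool_size, prompt_len, dim))

    def select(self, query, top_n):
        """Return indices of the top_n keys closest to the query (cosine)."""
        q = query / np.linalg.norm(query)
        k = self.keys / np.linalg.norm(self.keys, axis=1, keepdims=True)
        sims = k @ q
        return np.argsort(-sims)[:top_n]

    def prepend(self, token_embeddings, query, top_n):
        """Prepend the selected prompts to a (tokens, dim) embedding matrix."""
        idx = self.select(query, top_n)
        chosen = self.prompts[idx].reshape(-1, self.prompts.shape[-1])
        return np.concatenate([chosen, token_embeddings], axis=0), idx


pool = PromptPool(pool_size=10, prompt_len=5, dim=16)
tokens = np.zeros((8, 16))      # stand-in for the input's token embeddings
query = pool.keys[3].copy()     # stand-in for a frozen-model feature of the input
extended, idx = pool.prepend(tokens, query, top_n=2)
print(extended.shape, idx)
```

Because selection depends only on the input's query feature, no task identity is needed at test time in this sketch, which matches the setting the abstract describes.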

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications