Improving Representational Continuity via Continued Pretraining

Michael Sun; Ananya Kumar; Divyam Madaan; Percy Liang

arXiv:2302.13289 · cs.LG · February 28, 2023 · 1 citation

TL;DR

This paper studies continual representation learning and finds that LP-FT, a method from the transfer learning community, outperforms both naive training and continual learning methods when models are adapted the way practitioners adapt them (e.g., fine-tuning), and remains competitive under the standard kNN evaluation protocol.

Contribution

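It shows that LP-FT, a transfer learning approach, outperforms naive training and existing continual learning methods under realistic adaptation protocols, and performs comparably to strong continual learning methods under standard kNN evaluation while being simpler and requiring less memory.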
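LP-FT is generally understood as two stages: linear probing (train only a head on frozen features), then fine-tuning of all parameters starting from that probed head. The following is a minimal sketch of that two-stage schedule on a toy linear model, not the authors' implementation. The model (a linear backbone `W` with a linear head `v`), the squared loss, and every function name and hyperparameter here are assumptions chosen for illustration.

```python
def predict(W, v, x):
    # Backbone: linear features W @ x. Head: linear readout v . feats.
    feats = [sum(W[i][j] * x[j] for j in range(len(x))) for i in range(len(W))]
    return sum(v[i] * feats[i] for i in range(len(v))), feats


def mse(W, v, data):
    return sum((predict(W, v, x)[0] - y) ** 2 for x, y in data) / len(data)


def sgd_epochs(W, v, data, lr, epochs, train_backbone):
    """Per-sample gradient steps on squared error.

    The head v is always updated; the backbone W only if train_backbone.
    Inputs are copied, so the caller's W and v are left untouched.
    """
    W = [row[:] for row in W]
    v = v[:]
    for _ in range(epochs):
        for x, y in data:
            pred, feats = predict(W, v, x)
            err = 2.0 * (pred - y)
            grad_v = [err * f for f in feats]
            if train_backbone:
                for i in range(len(W)):
                    for j in range(len(x)):
                        W[i][j] -= lr * err * v[i] * x[j]
            for i in range(len(v)):
                v[i] -= lr * grad_v[i]
    return W, v


def lp_ft(W_pretrained, data, lp_lr=0.05, ft_lr=0.01, lp_epochs=50, ft_epochs=50):
    # Stage 1 (linear probing): backbone frozen, head trained from zero.
    v0 = [0.0] * len(W_pretrained)
    W_lp, v_lp = sgd_epochs(W_pretrained, v0, data, lp_lr, lp_epochs, False)
    # Stage 2 (fine-tuning): all parameters trained, starting from the probed head.
    W_ft, v_ft = sgd_epochs(W_lp, v_lp, data, ft_lr, ft_epochs, True)
    return (W_lp, v_lp), (W_ft, v_ft)


# Hypothetical data: y = 2*x0 - x1, while the "pretrained" backbone only sees x0.
data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0), ((-1.0, 1.0), -3.0)]
W0 = [[1.0, 0.0]]
(W_lp, v_lp), (W_ft, v_ft) = lp_ft(W0, data)
print("loss after LP:", mse(W_lp, v_lp, data))
print("loss after FT:", mse(W_ft, v_ft, data))
```

In this toy setup the probe alone cannot fit the target because the frozen backbone ignores the second input, so the fine-tuning stage is what closes the gap. Starting that stage from an already-fitted head, rather than a random one, is the design choice that distinguishes LP-FT from plain fine-tuning.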

Findings

01. LP-FT outperforms naive training and other continual learning methods.

02. When adapted with the method that works best (e.g., fine-tuning) rather than kNN, strong continual learning baselines perform worse than naive training (SGD).

03. LP-FT achieves state-of-the-art results on an NLP continual learning benchmark.

Abstract

We consider the continual representation learning setting: sequentially pretrain a model $M'$ on tasks $T_1, \dots, T_T$, and then adapt $M'$ on a small amount of data from each task $T_i$ to check if it has forgotten information from old tasks. Under a kNN adaptation protocol, prior work shows that continual learning methods reduce forgetting compared to naive training (SGD). In reality, practitioners do not use kNN classifiers -- they use the adaptation method that works best (e.g., fine-tuning) -- here, we find that strong continual learning baselines do worse than naive training. Interestingly, we find that a method from the transfer learning community (LP-FT) outperforms naive training and the other continual learning methods. Even with standard kNN evaluation protocols, LP-FT performs comparably with strong continual learning methods (while being simpler and requiring less memory) on…
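The kNN adaptation protocol mentioned in the abstract can be sketched as follows: embed a small labeled set from a task with the frozen encoder, then classify held-out points by majority vote among their nearest neighbors in feature space. This is an illustrative sketch only. The Euclidean metric, the value of `k`, the function names, and the toy encoders are assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def knn_predict(query, support_feats, support_labels, k=3):
    # Rank labeled support points by Euclidean distance in feature space.
    ranked = sorted(
        range(len(support_feats)),
        key=lambda i: math.dist(query, support_feats[i]),
    )
    votes = Counter(support_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(encoder, support, queries, k=3):
    """Accuracy of a frozen encoder under kNN adaptation on one task."""
    feats = [encoder(x) for x, _ in support]
    labels = [y for _, y in support]
    correct = sum(
        knn_predict(encoder(x), feats, labels, k) == y for x, y in queries
    )
    return correct / len(queries)


# Hypothetical task: two well-separated classes in 2-D.
support = [
    ((0.0, 0.0), 0), ((0.5, 0.2), 0), ((0.1, 0.6), 0),
    ((5.0, 5.0), 1), ((5.3, 4.8), 1), ((4.7, 5.4), 1),
]
queries = [((0.2, 0.1), 0), ((4.8, 5.1), 1)]

good_encoder = lambda x: x                 # keeps task-relevant information
collapsed_encoder = lambda x: (0.0, 0.0)   # stands in for a representation that forgot the task

print(knn_accuracy(good_encoder, support, queries))
print(knn_accuracy(collapsed_encoder, support, queries))
```

Running this once per task after sequential pretraining gives a per-task accuracy whose drop on earlier tasks serves as a measure of forgetting. Swapping `knn_accuracy` for a fine-tuning based adaptation routine gives the alternative protocol the abstract argues is closer to practice.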

Code & Models

Repositories

shiningsunnyday/ucl (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning