Chunking: Continual Learning is not just about Distribution Shift

Thomas L. Lee; Amos Storkey

arXiv:2310.02206·cs.LG·July 12, 2024

TL;DR

This paper argues that chunking, the fact that data arrives in chunks so only part of it is available for training at any time, is an important sub-problem of continual learning (CL). In the paper's experiments it accounts for around half of the performance drop from offline learning, and current CL algorithms do not address it.

Contribution

The paper decomposes continual learning into two sub-problems, distribution shift and chunking, and isolates the second. It shows that chunking is both important and currently unaddressed, and that it needs to be tackled for CL methods to improve.

Findings

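01. Chunking accounts for around half of the performance drop from offline learning in the paper's experiments.

02. When there is no shift in the data distribution, current CL algorithms perform only as well as plain SGD training.

03. Performance improvements from addressing chunking transfer to settings where the data distribution shifts.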

Abstract

Work on continual learning (CL) has thus far largely focused on the problems arising from shifts in the data distribution. However, CL can be decomposed into two sub-problems: (a) shifts in the data distribution, and (b) dealing with the fact that the data is split into chunks and so only a part of the data is available to be trained on at any point in time. In this work, we look at the latter sub-problem, the chunking of data. We show that chunking is an important part of CL, accounting for around half of the performance drop from offline learning in our experiments. Furthermore, our results reveal that current CL algorithms do not address the chunking sub-problem, only performing as well as plain SGD training when there is no shift in the data distribution. Therefore, we show that chunking is both an important and currently unaddressed sub-problem and until it is addressed CL methods…
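The chunking sub-problem described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function names `make_iid_chunks` and `train_in_chunks`, the single random shuffle used to keep chunks identically distributed, and the near-equal chunk sizes are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def make_iid_chunks(data: Sequence[T], num_chunks: int, seed: int = 0) -> List[List[T]]:
    """Hypothetical helper: shuffle once, then cut the data into near-equal chunks.

    Because the split follows a uniform random permutation, every chunk is
    drawn from the same distribution, so there is no distribution shift.
    """
    if not 1 <= num_chunks <= len(data):
        raise ValueError("num_chunks must be between 1 and len(data)")
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    base, extra = divmod(len(indices), num_chunks)
    chunks: List[List[T]] = []
    start = 0
    for i in range(num_chunks):
        # The first `extra` chunks take one additional example each.
        size = base + (1 if i < extra else 0)
        chunks.append([data[j] for j in indices[start:start + size]])
        start += size
    return chunks


def train_in_chunks(
    chunks: List[List[T]],
    train_step: Callable[[T], None],
    epochs_per_chunk: int = 1,
) -> int:
    """Hypothetical loop: visit chunks in order and never return to an earlier one.

    Only the current chunk is available to `train_step`, which is the
    constraint the abstract calls chunking. Returns the number of steps taken.
    """
    steps = 0
    for chunk in chunks:
        for _ in range(epochs_per_chunk):
            for example in chunk:
                train_step(example)
                steps += 1
    return steps


# Toy usage: ten examples, three chunks, two passes over each chunk.
data = list(range(10))
chunks = make_iid_chunks(data, num_chunks=3, seed=0)
seen: List[int] = []
total_steps = train_in_chunks(chunks, seen.append, epochs_per_chunk=2)
print([len(c) for c in chunks], total_steps)
```

In a real experiment `train_step` would be an SGD update on a model. With `num_chunks=1` the loop reduces to offline learning on the full dataset, which is the baseline the abstract measures the performance drop against.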

Code & Models

Repositories

tlee43/chunking-setting (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning

Methods: Stochastic Gradient Descent