Continual Learning in Linear Classification on Separable Data

Itay Evron; Edward Moroshko; Gon Buzaglo; Maroun Khriesh; Badea Marjieh; Nathan Srebro; Daniel Soudry

arXiv:2306.03534·cs.LG·June 7, 2023·1 cites

TL;DR

This paper provides a theoretical analysis of continual learning in linear classification tasks, showing how weak regularization leads to a sequential max-margin solution and deriving bounds on forgetting.

Contribution

The paper connects continual learning to sequential max-margin problems, a special case of the Projection Onto Convex Sets (POCS) framework, and derives upper bounds on forgetting for recurring task sequences, including cyclic and random orderings.
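The sequential max-margin problem mentioned here can be written out as follows. The notation is an assumption made for illustration and is not taken from this page: $S_t$ is the sample set of task $t$, $w_t$ is the weight vector after task $t$, and $P_t$ is the Euclidean projection onto the feasible set $\mathcal{W}_t$ of task $t$.

```latex
\begin{aligned}
w_0 &= 0, \\
w_t &= \operatorname*{arg\,min}_{w} \; \lVert w - w_{t-1} \rVert^2
      \quad \text{s.t.} \quad y \, w^\top x \ge 1 \;\; \forall (x, y) \in S_t, \\
\mathcal{W}_t &= \{\, w : y \, w^\top x \ge 1 \;\; \forall (x, y) \in S_t \,\},
      \qquad w_t = P_t(w_{t-1}).
\end{aligned}
```

Each $\mathcal{W}_t$ is an intersection of half-spaces and therefore convex, which is what makes the iteration an instance of successive projections onto convex sets.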

Findings

01 Learning with weak regularization reduces to a sequential max-margin problem.

02 Upper bounds on forgetting are derived for cyclic and random task sequences.

03 Practical implications for regularization scheduling and weighting are discussed.

Abstract

We analyze continual learning on a sequence of separable linear classification tasks with binary labels. We show theoretically that learning with weak regularization reduces to solving a sequential max-margin problem, corresponding to a special case of the Projection Onto Convex Sets (POCS) framework. We then develop upper bounds on the forgetting and other quantities of interest under various settings with recurring tasks, including cyclic and random orderings of tasks. We discuss several practical implications for popular training practices like regularization scheduling and weighting. We point out several theoretical differences between our continual classification setting and a recently studied continual regression setting.
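To make the sequential max-margin view concrete, here is a minimal sketch in Python. It is not the authors' code. It assumes each task's feasible set is {w : y_i * (w . x_i) >= 1 for every sample in the task} and moves the current weights to the nearest point of that set. The projection is computed with a Hildreth-style dual coordinate ascent, a solver chosen here for brevity; the function names and the two toy tasks are hypothetical.

```python
import numpy as np

def project_onto_task(w_prev, X, y, sweeps=2000, tol=1e-12):
    """Project w_prev onto {w : y_i * (w @ x_i) >= 1 for all i}.

    Hildreth-style dual coordinate ascent (an assumed solver choice):
    w = w_prev + sum_i alpha_i * a_i with alpha_i >= 0 and a_i = y_i * x_i.
    """
    A = y[:, None] * X                      # one constraint row per sample
    sq_norms = np.sum(A * A, axis=1)
    alpha = np.zeros(len(A))
    w = np.array(w_prev, dtype=float)       # working copy
    for _ in range(sweeps):
        max_change = 0.0
        for i in range(len(A)):
            # Step that would make constraint i tight, clipped so alpha_i >= 0.
            new_alpha = max(0.0, alpha[i] + (1.0 - A[i] @ w) / sq_norms[i])
            delta = new_alpha - alpha[i]
            w += delta * A[i]
            alpha[i] = new_alpha
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return w

def sequential_max_margin(tasks, dim, cycles=1):
    """Start at zero and project onto each task's feasible set in turn."""
    w = np.zeros(dim)
    iterates = [w.copy()]
    for _ in range(cycles):
        for X, y in tasks:
            w = project_onto_task(w, X, y)
            iterates.append(w.copy())
    return iterates

def min_margin(w, X, y):
    """Smallest value of y_i * (w @ x_i) over the samples of one task."""
    return float(np.min(y * (X @ w)))

# Hypothetical toy tasks in 2D, one labeled sample each.
tasks = [
    (np.array([[1.0, 0.0]]), np.array([1.0])),
    (np.array([[-1.0, 2.0]]), np.array([1.0])),
]

one_pass = sequential_max_margin(tasks, dim=2, cycles=1)
many_passes = sequential_max_margin(tasks, dim=2, cycles=50)

for name, w in [("one pass", one_pass[-1]), ("50 cycles", many_passes[-1])]:
    margins = [min_margin(w, X, y) for X, y in tasks]
    print(name, w, margins)
```

In this toy run, learning the second task pulls the first task's margin below 1, which is the kind of degradation the forgetting bounds are about. Repeating the two tasks cyclically drives the iterate toward a point that satisfies both tasks' margin constraints, as expected from alternating projections onto convex sets with a nonempty intersection.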

Videos

Continual Learning in Linear Classification on Separable Data · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning