Continual Learning in Linear Classification on Separable Data
Itay Evron, Edward Moroshko, Gon Buzaglo, Maroun Khriesh, Badea Marjieh, Nathan Srebro, Daniel Soudry

TL;DR
This paper provides a theoretical analysis of continual learning in linear classification tasks, showing how weak regularization leads to a sequential max-margin solution and deriving bounds on forgetting.
Contribution
It introduces a theoretical framework connecting continual learning with max-margin problems and offers bounds on forgetting in various task sequences.
Findings
Learning with weak regularization reduces to a sequential max-margin problem.
Upper bounds on forgetting are derived for cyclic and random task sequences.
Practical implications for regularization scheduling and weighting are discussed.
Abstract
We analyze continual learning on a sequence of separable linear classification tasks with binary labels. We show theoretically that learning with weak regularization reduces to solving a sequential max-margin problem, corresponding to a special case of the Projection Onto Convex Sets (POCS) framework. We then develop upper bounds on the forgetting and other quantities of interest under various settings with recurring tasks, including cyclic and random orderings of tasks. We discuss several practical implications to popular training practices like regularization scheduling and weighting. We point out several theoretical differences between our continual classification setting and a recently studied continual regression setting.
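To make the sequential max-margin view concrete, here is a minimal Python sketch. It is not the authors' code. It makes a simplifying assumption that is not in the abstract: each task holds a single labeled sample, so its feasible set {w : y * <w, x> >= 1} is one halfspace and the projection has a closed form. The names project_halfspace and sequential_max_margin and the two toy tasks are hypothetical, chosen only for illustration.

```python
import numpy as np

def project_halfspace(w, x, y):
    """Euclidean projection of w onto the set {v : y * <v, x> >= 1}."""
    margin = y * np.dot(w, x)
    if margin >= 1.0:
        return w.copy()  # constraint already satisfied, nothing to move
    # Move along y * x by just enough to reach margin exactly 1.
    return w + (1.0 - margin) / np.dot(x, x) * y * x

def sequential_max_margin(tasks, n_cycles):
    """Cyclically project onto each task's feasible set, starting from zero.

    Assumption: every task is one (x, y) pair, so each projection is closed form.
    Returns the list of all iterates, including the initial zero vector.
    """
    w = np.zeros(tasks[0][0].shape[0])
    iterates = [w]
    for _ in range(n_cycles):
        for x, y in tasks:
            w = project_halfspace(w, x, y)
            iterates.append(w)
    return iterates

# Two hypothetical single-sample tasks in 2D that are jointly separable,
# for example by w = (1, -2).
tasks = [(np.array([1.0, 0.0]), 1), (np.array([1.0, 1.0]), -1)]
iterates = sequential_max_margin(tasks, n_cycles=50)
final = iterates[-1]
margins = [y * np.dot(final, x) for x, y in tasks]
print(final, margins)
```

In this toy run, projecting onto the second task's set breaks the first task's margin constraint, which is a simple picture of forgetting. Repeating the cycle drives the iterates toward a point that satisfies both constraints, which is how POCS behaves when the sets intersect. With multi-sample tasks, each projection would instead require solving a small quadratic program.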
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
