Target Layer Regularization for Continual Learning Using Cramer-Wold Generator
Marcin Mazur, Łukasz Pustelnik, Szymon Knop, Patryk Pagacz, Przemysław Spurek

TL;DR
This paper introduces CW-TaLaR, a regularization method for continual learning that uses Cramer-Wold distance to preserve neural network target layer distributions without needing past data, showing competitive results.
Contribution
The paper presents CW-TaLaR, a novel regularization approach utilizing Cramer-Wold distance for effective continual learning without storing previous datasets.
Findings
Demonstrates the competitiveness of CW-TaLaR in experiments across several common supervised frameworks.
Is competitive with a few existing state-of-the-art continual learning models.
Does not require remembering previous task datasets.
Abstract
We propose an effective regularization strategy (CW-TaLaR) for solving continual learning problems. It uses a penalizing term expressed by the Cramer-Wold distance between two probability distributions defined on a target layer of an underlying neural network that is shared by all tasks, and the simple architecture of the Cramer-Wold generator for modeling output data representation. Our strategy preserves target layer distribution while learning a new task but does not require remembering previous tasks' datasets. We perform experiments involving several common supervised frameworks, which prove the competitiveness of the CW-TaLaR method in comparison to a few existing state-of-the-art continual learning models.
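To make the penalizing term more concrete, the following is a minimal sketch of a sample-based squared Cramer-Wold distance between two sets of target-layer vectors. It rests on assumptions not stated on this page: the closed-form estimate and the asymptotic kernel approximation (usually quoted for dimension 20 or more) are taken from the Cramer-Wold autoencoder literature, the pair sums are averaged so that the two samples may differ in size, and the names `phi_approx`, `mean_kernel`, `cw_distance_sq` and the default `gamma=1.0` are hypothetical choices for illustration. It is not the authors' implementation.

```python
import math
import random


def phi_approx(s, dim):
    """Asymptotic approximation of the kernel profile (assumed, for dim >= 20)."""
    return 1.0 / math.sqrt(1.0 + 4.0 * s / (2.0 * dim - 3.0))


def mean_kernel(xs, ys, gamma, dim):
    """Average of phi(||x - y||^2 / (4 * gamma)) over all pairs (x, y)."""
    total = 0.0
    for x in xs:
        for y in ys:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            total += phi_approx(sq_dist / (4.0 * gamma), dim)
    return total / (len(xs) * len(ys))


def cw_distance_sq(xs, ys, gamma=1.0):
    """Squared Cramer-Wold distance estimate between two samples.

    xs, ys: lists of equal-length float vectors (e.g. target-layer outputs).
    gamma: smoothing parameter; the default here is an illustrative assumption.
    """
    dim = len(xs[0])
    scale = 1.0 / (2.0 * math.sqrt(math.pi * gamma))
    return scale * (
        mean_kernel(xs, xs, gamma, dim)
        + mean_kernel(ys, ys, gamma, dim)
        - 2.0 * mean_kernel(xs, ys, gamma, dim)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    dim = 32
    # Synthetic stand-ins for target-layer samples; not data from the paper.
    reference = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(64)]
    drifted = [[v + 2.0 for v in row] for row in reference]
    print("reference vs reference:", cw_distance_sq(reference, reference))
    print("reference vs drifted:  ", cw_distance_sq(reference, drifted))
```

In a regularizer of the kind the abstract describes, one sample would stand for the target-layer distribution to be preserved and the other for the target-layer outputs of the network while it learns a new task; the estimate grows as the second sample drifts away from the first, which is what makes it usable as a penalty.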
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition
