Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates

Murat Onur Yildirim; Elif Ceren Gok Yildirim; Ghada Sokar; Decebal Constantin Mocanu; Joaquin Vanschoren

arXiv:2308.14831·cs.LG·December 5, 2023·2 cites

Open Access · 1 Repo

TL;DR

This paper empirically investigates how different components of Dynamic Sparse Training affect continual learning performance, identifying optimal configurations for task-incremental learning on CIFAR100 and miniImageNet.

Contribution

It provides the first comprehensive analysis of DST components in continual learning, highlighting effective initialization and growth strategies for sparse networks.

Findings

01 Erdős-Rényi Kernel (ERK) initialization is effective at low sparsity levels.

02 Uniform initialization is more reliable at high sparsity levels.

03 Adaptive DST components improve continual learning performance.
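
To make the difference between the two initializations in findings 01 and 02 concrete, here is a minimal sketch of how per-layer densities could be allocated. It assumes the commonly used ERK formulation, in which a layer's density is proportional to the sum of its dimensions divided by their product, with layers that would exceed full density kept dense. The function names and the example layer shapes are hypothetical and are not taken from the paper's repository, whose implementation may differ.

```python
from math import prod


def uniform_densities(shapes, sparsity):
    """Every layer keeps the same fraction of its weights."""
    return [1.0 - sparsity for _ in shapes]


def erk_densities(shapes, sparsity):
    """Per-layer densities under an assumed ERK allocation.

    shapes: list of weight shapes, e.g. (out, in, kh, kw) or (out, in).
    sparsity: overall fraction of weights removed, strictly between 0 and 1.
    """
    if not 0.0 < sparsity < 1.0:
        raise ValueError("sparsity must be strictly between 0 and 1")
    sizes = [prod(s) for s in shapes]
    # Raw ERK score: sum of dimensions over product of dimensions.
    raw = [sum(s) / prod(s) for s in shapes]
    target_nonzeros = (1.0 - sparsity) * sum(sizes)
    dense = set()  # layers forced to density 1.0
    while True:
        rest = [i for i in range(len(shapes)) if i not in dense]
        budget = target_nonzeros - sum(sizes[i] for i in dense)
        eps = budget / sum(raw[i] * sizes[i] for i in rest)
        over = [i for i in rest if eps * raw[i] > 1.0]
        if not over:
            break
        # Cap these layers at full density and redistribute the remainder.
        dense.update(over)
    return [1.0 if i in dense else eps * raw[i] for i in range(len(shapes))]


# Hypothetical layer shapes: two conv layers and a linear classifier.
layer_shapes = [(64, 3, 3, 3), (128, 64, 3, 3), (10, 128)]
print(uniform_densities(layer_shapes, 0.8))
print(erk_densities(layer_shapes, 0.8))
```

In this sketch both schemes keep the same total number of weights, but ERK gives small layers a higher density and large layers a lower one, whereas the uniform scheme treats every layer identically.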

Abstract

Continual learning (CL) refers to the ability of an intelligent system to sequentially acquire and retain knowledge from a stream of data with as little computational overhead as possible. To this end, regularization, replay, architecture, and parameter isolation approaches have been introduced in the literature. Parameter isolation using a sparse network makes it possible to allocate distinct parts of the neural network to different tasks while still sharing parameters between tasks that are similar. Dynamic Sparse Training (DST) is a prominent way to find these sparse networks and isolate them for each task. This paper is the first empirical study investigating the effect of different DST components under the CL paradigm, aiming to fill a critical research gap and shed light on the optimal configuration of DST for CL, if one exists. Therefore, we perform a comprehensive study in which we…
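
As a rough illustration of what a single DST topology update looks like, the sketch below prunes the smallest-magnitude active weights and regrows the same number of connections at random inactive positions. This is a generic prune-and-grow step under assumed choices (magnitude pruning, random growth, zero-initialized new weights), not the specific configuration studied in the paper. The function name `prune_and_grow` and its parameters are hypothetical.

```python
import random


def prune_and_grow(weights, mask, drop_fraction, rng):
    """One illustrative DST update on a flattened layer.

    weights: list of floats; mask: list of 0/1 flags of the same length.
    drop_fraction: share of active weights to prune in this step.
    rng: a random.Random instance used to pick regrowth positions.
    Returns new (weights, mask) lists; the inputs are left unchanged.
    """
    active = [i for i, m in enumerate(mask) if m]
    # Candidates for growth are fixed before pruning, so a weight that is
    # pruned in this step cannot be regrown immediately.
    inactive = [i for i, m in enumerate(mask) if not m]
    k = min(int(drop_fraction * len(active)), len(inactive))

    # Prune: drop the k active weights with the smallest magnitude.
    to_drop = sorted(active, key=lambda i: abs(weights[i]))[:k]
    # Grow: activate k previously inactive positions chosen at random.
    to_grow = rng.sample(inactive, k)

    new_weights, new_mask = list(weights), list(mask)
    for i in to_drop:
        new_mask[i] = 0
        new_weights[i] = 0.0
    for i in to_grow:
        new_mask[i] = 1
        new_weights[i] = 0.0  # assumed: new connections start at zero
    return new_weights, new_mask


w = [0.5, -0.1, 0.0, 0.9, 0.0, -0.05, 0.0, 0.0]
m = [1, 1, 0, 1, 0, 1, 0, 0]
new_w, new_m = prune_and_grow(w, m, drop_fraction=0.5, rng=random.Random(0))
print(new_m)
```

The number of active connections is the same before and after the step, so the sparsity level is preserved while the topology changes. Swapping the random choice in the growth step for a gradient-based or other criterion would give a different growth strategy.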

Code & Models

Repositories

muratonuryildirim/cl-with-dst
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Machine Learning in Healthcare

Methods: Dynamic Sparse Training · Focus