DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity

Baekrok Shin; Junsoo Oh; Hanseul Cho; Chulhee Yun

arXiv:2410.23495·cs.LG·November 4, 2024

TL;DR

This paper introduces DASH, a method that mitigates plasticity loss in neural networks warm-started on stationary data by selectively forgetting memorized noise while preserving learned features, improving both test accuracy and training efficiency.

Contribution

The paper develops a framework to understand plasticity loss in warm-started neural networks and proposes DASH, a novel method to address this issue in stationary settings.

Findings

01. DASH improves test accuracy on vision tasks.

02. DASH enhances training efficiency.

03. Noise memorization is identified as a key cause of plasticity loss.

Abstract

Warm-starting neural network training by initializing networks with previously learned weights is appealing, as practical neural networks are often deployed under a continuous influx of new data. However, it often leads to loss of plasticity, where the network loses its ability to learn new information, resulting in worse generalization than training from scratch. This occurs even under stationary data distributions, and its underlying mechanism is poorly understood. We develop a framework emulating real-world neural network training and identify noise memorization as the primary cause of plasticity loss when warm-starting on stationary data. Motivated by this, we propose Direction-Aware SHrinking (DASH), a method aiming to mitigate plasticity loss by selectively forgetting memorized noise while preserving learned features. We validate our approach on vision tasks, demonstrating…
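
The abstract names the idea (shrink weights in a direction-aware way so that memorized noise is forgotten while learned features survive) but does not state the update rule. The following is a minimal sketch of one way such a rule could look, not the authors' implementation. It assumes that "direction" is measured per neuron as the cosine similarity between the neuron's weight vector and the negative loss gradient, and that the shrink factor is clipped from below; the function names and the `min_scale` floor are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def direction_aware_shrink(weights, grads, min_scale=0.3):
    """Scale each neuron's weight vector by a direction-dependent factor.

    weights, grads: lists of per-neuron vectors (lists of floats), where
        grads[i] is the loss gradient with respect to weights[i].
    min_scale: hypothetical lower bound on the shrink factor (an assumption).
    """
    shrunk = []
    for w, g in zip(weights, grads):
        neg_grad = [-x for x in g]
        # How well the current weights align with the descent direction.
        alignment = cosine_similarity(w, neg_grad)
        # Well-aligned neurons are kept almost intact; the rest shrink to the floor.
        scale = max(min_scale, alignment)
        shrunk.append([scale * x for x in w])
    return shrunk


# Toy example: neuron 0 is aligned with the negative gradient, neuron 1 opposes it.
toy_weights = [[1.0, 0.0], [0.0, 2.0]]
toy_grads = [[-1.0, 0.0], [0.0, 1.0]]
toy_result = direction_aware_shrink(toy_weights, toy_grads, min_scale=0.3)
```

Under these assumptions, the aligned neuron keeps its weights and the opposing neuron is scaled down to the floor, which mirrors the "preserve features, forget noise" intuition in the abstract. In practice such a step would be applied once when new data arrives, before training resumes.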

Code & Models

Videos

DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity (SlidesLive)

Taxonomy

Topics: Neural Networks and Applications