The Quest of Finding the Antidote to Sparse Double Descent

Victor Quétu; Marta Milovanović

arXiv:2308.16596 · cs.AI · September 1, 2023

Open Access

TL;DR

This paper investigates sparse double descent in deep learning models, where performance changes non-monotonically as sparsity increases, and proposes regularization techniques, including knowledge distillation, to avoid it.

Contribution

It introduces a novel learning scheme with knowledge distillation to effectively mitigate sparse double descent in deep models.

Findings

01. L2 regularization can mitigate sparse double descent, but it worsens the sparsity-performance trade-off.

02. Knowledge distillation effectively prevents sparse double descent without sacrificing sparsity.

03. Experimental results confirm the proposed method's effectiveness in image classification tasks.
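
The two regularizers named in the findings can be sketched as loss terms. The block below is a minimal illustration, assuming the widely used temperature-scaled distillation loss (cross-entropy on the label plus a KL term toward the teacher) and a plain squared-norm penalty. The function names, `temperature`, `alpha`, and `weight_decay` are illustrative assumptions; this page does not give the paper's exact formulation or hyperparameters.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, label):
    """Standard cross-entropy of the student against the ground-truth label."""
    return -math.log(softmax(student_logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Hypothetical combined loss: (1 - alpha) * hard term + alpha * soft term.

    `temperature` and `alpha` are assumed values, not taken from the paper.
    """
    hard = cross_entropy(student_logits, label)
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    soft = kl_divergence(p_teacher, p_student) * temperature ** 2
    return (1 - alpha) * hard + alpha * soft


def l2_penalty(weights, weight_decay=1e-4):
    """Squared-norm penalty added to the task loss (assumed coefficient)."""
    return 0.5 * weight_decay * sum(w * w for w in weights)
```

In this sketch the sparse student would be trained on `distillation_loss`, whereas the L2 variant would add `l2_penalty` to the ordinary cross-entropy. When student and teacher logits coincide, the soft term vanishes and only the label term remains.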

Abstract

In energy-efficient schemes, finding the optimal size of deep learning models is very important and has a broad impact. Meanwhile, recent studies have reported an unexpected phenomenon, the sparse double descent: as the model's sparsity increases, the performance first worsens, then improves, and finally deteriorates. Such a non-monotonic behavior raises serious questions about the optimal model's size to maintain high performance: the model needs to be sufficiently over-parametrized, but having too many parameters wastes training resources. In this paper, we aim to find the best trade-off efficiently. More precisely, we tackle the occurrence of the sparse double descent and present some solutions to avoid it. Firstly, we show that a simple $ℓ_{2}$ regularization method can help to mitigate this phenomenon but sacrifices the performance/sparsity compromise. To overcome this problem,…
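
The sparsity axis in the abstract is typically produced by pruning a model to increasing sparsity levels and measuring performance at each one. The sketch below assumes unstructured magnitude pruning on a flat list of weights; the abstract does not state which pruning criterion or schedule the authors use, so `magnitude_prune` and `sparsity_of` are hypothetical helpers for illustration only.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.

    The number of pruned weights is rounded down. This is an assumed,
    commonly used criterion, not necessarily the one used in the paper.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(sparsity * len(weights))
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def sparsity_of(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)
```

For example, `magnitude_prune([0.5, -0.1, 2.0, 0.05], 0.5)` returns `[0.5, 0.0, 2.0, 0.0]`. Repeating this for a range of sparsity values, retraining, and recording test performance at each level would trace the kind of curve on which sparse double descent is observed.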

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Machine Learning and Algorithms