Cyclic Sparse Training: Is it Enough?

Advait Gadhikar; Sree Harsha Nelaturu; Rebekka Burkholz

arXiv:2406.02773·cs.LG·June 10, 2024

TL;DR

This paper investigates cyclic sparse training and proposes SCULPT-ing, a method that couples parameters and masks through repeated cyclic training and pruning, matching state-of-the-art performance at reduced computational cost.

Contribution

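The paper challenges the hypothesis that iterative pruning succeeds mainly through improved mask identification and implicit regularization, showing instead that repeated cyclic training improves optimization. It also introduces SCULPT-ing, a new method that reduces computational cost while maintaining performance at high sparsity.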

Findings

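01. Repeated cyclic training significantly boosts pruning at initialization, even outperforming standard iterative pruning methods.

02. Repeated cyclic training appears to explore the loss landscape better, leading to a lower training loss.

03. SCULPT-ing matches state-of-the-art performance at high sparsity.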

Abstract

The success of iterative pruning methods in achieving state-of-the-art sparse networks has largely been attributed to improved mask identification and an implicit regularization induced by pruning. We challenge this hypothesis and instead posit that their repeated cyclic training schedules enable improved optimization. To verify this, we show that pruning at initialization is significantly boosted by repeated cyclic training, even outperforming standard iterative pruning methods. The dominant mechanism by which this is achieved, we conjecture, is a better exploration of the loss landscape, leading to a lower training loss. However, at high sparsity, repeated cyclic training alone is not enough for competitive performance. A strong coupling between learnt parameter initialization and mask seems to be required. Standard methods obtain this coupling via expensive…
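To make "repeated cyclic training" of a fixed sparse mask concrete, the following is a minimal sketch and not the authors' implementation. It assumes a schedule with linear warmup and cosine decay that restarts every cycle, a random mask as a stand-in for pruning at initialization, and a toy linear regression problem. All names (`cyclic_lr`, `random_mask`, `train_cyclic`) and all hyperparameters are hypothetical.

```python
import math
import random

def cyclic_lr(step, steps_per_cycle, peak_lr=0.1, warmup_steps=5):
    """Learning rate that restarts every cycle: linear warmup, then cosine decay.
    The exact shape is an assumption made for illustration."""
    t = step % steps_per_cycle
    if t < warmup_steps:
        return peak_lr * (t + 1) / warmup_steps
    progress = (t - warmup_steps) / max(1, steps_per_cycle - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

def random_mask(n, sparsity, rng):
    """Binary mask fixed before training (a stand-in for pruning at initialization)."""
    keep = n - int(round(sparsity * n))
    kept = set(rng.sample(range(n), keep))
    return [1.0 if i in kept else 0.0 for i in range(n)]

def loss_and_grad(w, xs, ys):
    """Mean squared error of a linear model and its gradient."""
    n = len(xs)
    loss, grad = 0.0, [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        loss += err * err / n
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / n
    return loss, grad

def train_cyclic(w, mask, xs, ys, cycles=3, steps_per_cycle=50):
    """Train masked weights for several cycles; the mask never changes."""
    w = [wi * mi for wi, mi in zip(w, mask)]  # zero out pruned weights
    losses = []
    for step in range(cycles * steps_per_cycle):
        lr = cyclic_lr(step, steps_per_cycle)
        _, grad = loss_and_grad(w, xs, ys)
        # Masked update: pruned weights receive no gradient and stay at zero.
        w = [wi - lr * gi * mi for wi, gi, mi in zip(w, grad, mask)]
        if (step + 1) % steps_per_cycle == 0:
            losses.append(loss_and_grad(w, xs, ys)[0])  # loss at end of each cycle
    return w, losses

rng = random.Random(0)
dim = 8
true_w = [rng.uniform(-1, 1) for _ in range(dim)]
xs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(64)]
ys = [sum(a * b for a, b in zip(true_w, x)) for x in xs]
mask = random_mask(dim, sparsity=0.5, rng=rng)
w0 = [rng.gauss(0, 0.1) for _ in range(dim)]
w, losses = train_cyclic(w0, mask, xs, ys)
```

The toy problem is convex, so it only illustrates the mechanics (schedule restarts, a mask held fixed across cycles) and should not be expected to reproduce the loss-landscape exploration effect the abstract conjectures for deep networks.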

Taxonomy

Topics: Innovative Teaching Methods

Methods: Pruning