Adaptive Regularization for Sparsity Control in Bregman-Based Optimizers
Ahmad Aloradi, Tim Roith, Emanuël A. P. Habets, Daniel Tenbrinck

TL;DR
This paper introduces an adaptive regularization method for Bregman-based optimizers that effectively controls sparsity levels in neural network training, reducing the need for trial-and-error parameter tuning.
Contribution
The authors propose an adaptive regularization scheme that dynamically adjusts the regularization parameter to reliably achieve target sparsity levels in deep neural networks.
Findings
The adaptive method reliably achieves sparsity targets between 75% and 99%.
It converges faster than non-adaptive baselines during early training.
The scheme improves out-of-distribution robustness over dense baselines.
Abstract
Sparse training reduces the memory and computational costs of deep neural networks. However, sparse optimization methods, e.g., those adding an $\ell_1$ penalty, often control sparsity only indirectly through a regularization parameter $\lambda$, whose mapping to the final sparsity rate is non-trivial. In our experiments, we found this parameter sensitivity to be particularly pronounced for Bregman-based optimizers. Specifically, the two variants LinBreg and AdaBreg reach the same sparsity at $\lambda$ values that differ by up to two orders of magnitude, requiring expensive trial-and-error sweeps to achieve a user-specified sparsity. To address this, we propose an adaptive regularization scheme that updates $\lambda$ based on the difference between the model's current sparsity and the target sparsity. We analyze the resulting algorithm and evaluate it on automatic speaker verification…
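The feedback idea in the abstract can be sketched on a toy problem. The abstract does not state the update rule, so the proportional rule below is an assumption, not the authors' scheme. The names `update_lambda`, `gain`, and `run_toy` are hypothetical. The regularization parameter is called `lam`, and it is applied through a simplified LinBreg-style step (a gradient step on a dual variable followed by soft-thresholding) on a small least-squares problem.

```python
import numpy as np


def sparsity(theta):
    """Fraction of entries that are exactly zero."""
    return float(np.mean(theta == 0.0))


def soft_threshold(v, lam):
    """Shrinkage operator: proximal map of lam * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def update_lambda(lam, current, target, gain=0.1):
    """Assumed proportional rule (illustrative, not taken from the paper).

    Raises lam when the model is denser than the target,
    lowers it when the model is sparser, and keeps lam >= 0.
    """
    return max(0.0, lam + gain * (target - current))


def run_toy(target=0.9, steps=500, tau=0.01, seed=0):
    """Toy least-squares problem with a LinBreg-style update."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(200, 100))
    x_true = np.zeros(100)
    x_true[:5] = rng.normal(size=5)
    b = A @ x_true

    v = np.zeros(100)   # dual (subgradient) variable
    lam = 1.0           # initial regularization strength (arbitrary)
    theta = soft_threshold(v, lam)
    for _ in range(steps):
        grad = A.T @ (A @ theta - b) / len(b)
        v -= tau * grad                   # gradient step on the dual variable
        theta = soft_threshold(v, lam)    # sparse primal iterate
        lam = update_lambda(lam, sparsity(theta), target)
    return lam, sparsity(theta)


if __name__ == "__main__":
    lam, s = run_toy()
    print(f"final lam={lam:.4f}, sparsity={s:.2%}")
```

The point of the sketch is that the user specifies a sparsity target rather than sweeping `lam` by hand. How closely the loop tracks the target depends on `gain` and the step size, neither of which comes from the paper.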
