Modulating Regularization Frequency for Efficient Compression-Aware Model Training

Dongsoo Lee; Se Jung Kwon; Byeongwook Kim; Jeongin Yun; Baeseong Park; Yongkweon Jeon

arXiv:2105.01875·cs.LG·May 6, 2021

Open Access

TL;DR

This paper introduces regularization frequency, i.e., how often compression is performed during training, as a regularization technique for practical and efficient compression-aware neural network training.

Contribution

The paper introduces regularization frequency as a new parameter, alongside compression ratio, for controlling the regularization strength that model compression imposes during training.

Findings

01

Regularization frequency significantly impacts model accuracy.

02

Combining regularization frequency with compression ratio improves training outcomes.

03

Occasional compression can match or outperform frequent compression.

Abstract

While model compression is increasingly important because of large neural network size, compression-aware training is challenging as it needs sophisticated model modifications and longer training time. In this paper, we introduce regularization frequency (i.e., how often compression is performed during training) as a new regularization technique for a practical and efficient compression-aware training method. For various regularization techniques, such as weight decay and dropout, optimizing the regularization strength is crucial to improve generalization in Deep Neural Networks (DNNs). While model compression also demands the right amount of regularization, the regularization strength incurred by model compression has been controlled only by compression ratio. Throughout various experiments, we show that regularization frequency critically affects the regularization strength of model…
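To make "regularization frequency" concrete, here is a minimal sketch, not the authors' implementation. It assumes magnitude pruning as the compression step and a toy linear-regression task; the names `magnitude_prune`, `train`, `period`, and `sparsity` are hypothetical and chosen only for illustration. The weights are compressed once every `period` SGD steps, so a smaller `period` means a higher regularization frequency.

```python
import random


def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Indices sorted by ascending magnitude; the first n_prune get zeroed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def train(period, sparsity=0.5, steps=200, lr=0.05, seed=0):
    """SGD on a toy linear regression, compressing every `period` steps.

    Returns (final_weights, number_of_compression_events_during_training).
    """
    rng = random.Random(seed)
    true_w = [2.0, -1.0, 0.0, 0.0]  # assumed toy target, two useful weights
    w = [rng.uniform(-0.1, 0.1) for _ in true_w]
    n_compress = 0
    for step in range(1, steps + 1):
        x = [rng.gauss(0.0, 1.0) for _ in true_w]
        y = sum(t * xi for t, xi in zip(true_w, x))
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        # Gradient step on the squared error 0.5 * err**2.
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        # Regularization frequency: compress only on every `period`-th step.
        if step % period == 0:
            w = magnitude_prune(w, sparsity)
            n_compress += 1
    # Compress once more at the end so the returned model is always compressed.
    return magnitude_prune(w, sparsity), n_compress
```

With the default 200 steps, `train(period=1)` compresses at every step, while `train(period=50)` compresses only four times during training. Comparing the two on held-out data is the kind of experiment the abstract describes, though this toy setup says nothing about the paper's actual results.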

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification

Methods: Weight Decay