S-Cyc: A Learning Rate Schedule for Iterative Pruning of ReLU-based Networks
Shiyu Liu, Chong Min John Tan, Mehul Motani

TL;DR
This paper introduces S-Cyc, a learning rate schedule for iterative pruning of ReLU networks that improves performance by raising the LR upper bound as the network becomes sparser.
Contribution
The paper proposes S-Cyc, an adaptive cyclical learning rate schedule that raises max_lr along an S-shaped curve as pruning proceeds, outperforming existing schedules across multiple networks and datasets.
Findings
S-Cyc outperforms benchmark LR schedules by 2.1%-3.4%.
S-Cyc achieves performance close to an oracle with grid-tuned max_lr.
Gradient distribution narrows as networks are pruned, justifying larger LR.
Abstract
We explore a new perspective on adapting the learning rate (LR) schedule to improve the performance of the ReLU-based network as it is iteratively pruned. Our work and contribution consist of four parts: (i) We find that, as the ReLU-based network is iteratively pruned, the distribution of weight gradients tends to become narrower. This leads to the finding that as the network becomes more sparse, a larger value of LR should be used to train the pruned network. (ii) Motivated by this finding, we propose a novel LR schedule, called S-Cyclical (S-Cyc), which adapts the conventional cyclical LR schedule by gradually increasing the LR upper bound (max_lr) in an S-shape as the network is iteratively pruned. We highlight that S-Cyc is a method-agnostic LR schedule that applies to many iterative pruning methods. (iii) We evaluate the performance of the proposed S-Cyc and compare it to four LR…
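The idea in part (ii) can be illustrated with a minimal sketch. The excerpt above does not give the exact functional form, so everything specific here is an assumption: the S-shape is modeled as a logistic curve over pruning rounds, the within-round schedule is a triangular cycle, and the names `s_shaped_max_lr`, `cyclical_lr`, `s_cyc_lr`, `lr_low`, `lr_high`, and `steepness` are hypothetical. It is not the authors' implementation.

```python
import math


def s_shaped_max_lr(round_idx, num_rounds, lr_low, lr_high, steepness=10.0):
    """Upper LR bound for one pruning round (assumed logistic S-curve).

    The value rises from near lr_low in early rounds to near lr_high in
    late rounds; with a finite steepness it approaches but does not
    exactly reach either end.
    """
    if num_rounds < 2:
        return lr_high
    progress = round_idx / (num_rounds - 1)  # 0.0 at first round, 1.0 at last
    s = 1.0 / (1.0 + math.exp(-steepness * (progress - 0.5)))
    return lr_low + (lr_high - lr_low) * s


def cyclical_lr(step, steps_per_cycle, base_lr, max_lr):
    """Triangular cycle (assumed shape): base_lr -> max_lr -> base_lr."""
    pos = (step % steps_per_cycle) / steps_per_cycle  # position in [0, 1)
    tri = 1.0 - abs(2.0 * pos - 1.0)  # 0 at the cycle start, 1 at its middle
    return base_lr + (max_lr - base_lr) * tri


def s_cyc_lr(round_idx, step, *, num_rounds, steps_per_cycle,
             base_lr, lr_low, lr_high):
    """LR at `step` of pruning round `round_idx` under the assumptions above."""
    max_lr = s_shaped_max_lr(round_idx, num_rounds, lr_low, lr_high)
    return cyclical_lr(step, steps_per_cycle, base_lr, max_lr)


if __name__ == "__main__":
    # Peak LR of each round (taken at the middle step of the cycle).
    for r in range(11):
        peak = s_cyc_lr(r, 50, num_rounds=11, steps_per_cycle=100,
                        base_lr=0.001, lr_low=0.01, lr_high=0.1)
        print(f"round {r:2d}: peak lr = {peak:.4f}")
```

Because the schedule depends only on the pruning round and the training step, a sketch like this is independent of which weights the pruning method removes, which is consistent with the method-agnostic claim in the abstract.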
Taxonomy
Topics: Energy Efficient Wireless Sensor Networks · Software-Defined Networks and 5G · Wireless Body Area Networks
Methods: Pruning · Visual Geometry Group 19 Layer CNN
