Joslim: Joint Widths and Weights Optimization for Slimmable Neural Networks
Ting-Wu Chin, Ari S. Morcos, Diana Marculescu

TL;DR
This paper introduces Joslim, a framework that jointly optimizes layer widths and shared weights in slimmable neural networks, improving top-1 accuracy by up to 1.7% on ImageNet for MobileNetV2.
Contribution
The paper proposes a general joint optimization framework for slimmable networks that subsumes conventional and NAS-based slimmable methods as special cases. It also introduces Joslim, an algorithm that optimizes widths and weights simultaneously rather than independently.
Findings
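Up to 1.7% top-1 accuracy improvement on ImageNet for MobileNetV2 when networks are compared by FLOPs.
Up to 8% top-1 accuracy improvement on ImageNet for MobileNetV2 when networks are compared by memory footprint.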
Outperforms existing slimmable network optimization methods.
Abstract
Slimmable neural networks provide a flexible trade-off front between prediction error and computational requirement (such as the number of floating-point operations or FLOPs) with the same storage requirement as a single model. They are useful for reducing maintenance overhead for deploying models to devices with different memory constraints and are useful for optimizing the efficiency of a system with many CNNs. However, existing slimmable network approaches either do not optimize layer-wise widths or optimize the shared-weights and layer-wise widths independently, thereby leaving significant room for improvement by joint width and weight optimization. In this work, we propose a general framework to enable joint optimization for both width configurations and weights of slimmable networks. Our framework subsumes conventional and NAS-based slimmable methods as special cases and provides…
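The weight-sharing idea behind slimmable networks can be illustrated with a minimal sketch. It is not the Joslim algorithm and does not come from the paper's code: the class `SlimmableLinear`, its methods, and the convention that a narrower layer uses the first rows of the full weight matrix are assumptions made for illustration. The sketch shows how one stored weight matrix can serve several widths, each with a different compute cost.

```python
import random


class SlimmableLinear:
    """Hypothetical dense layer whose narrower variants reuse a prefix of the full weights."""

    def __init__(self, max_in, max_out, seed=0):
        rng = random.Random(seed)
        # One shared weight matrix, stored once at the maximum width.
        self.weight = [[rng.uniform(-1.0, 1.0) for _ in range(max_in)]
                       for _ in range(max_out)]
        self.max_in = max_in
        self.max_out = max_out

    def forward(self, x, out_width):
        """Compute the first `out_width` outputs from an input of len(x) channels."""
        if not (1 <= out_width <= self.max_out) or len(x) > self.max_in:
            raise ValueError("requested width exceeds the stored weights")
        # Slice the shared matrix: first `out_width` rows, first len(x) columns.
        return [sum(w * v for w, v in zip(row[:len(x)], x))
                for row in self.weight[:out_width]]

    @staticmethod
    def macs(in_width, out_width):
        # One multiply-accumulate per weight actually used.
        return in_width * out_width


layer = SlimmableLinear(max_in=8, max_out=16)
x = [1.0] * 8
full = layer.forward(x, out_width=16)  # widest sub-network
half = layer.forward(x, out_width=8)   # narrower sub-network, same stored weights
```

Here the half-width output equals the first eight entries of the full-width output, at half the multiply-accumulate count. Choosing `out_width` separately for every layer is the width configuration that the abstract refers to; how Joslim searches over those configurations is not shown here.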
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
Methods: Pointwise Convolution · Depthwise Convolution · ReLU6 · Sigmoid Activation · Depthwise Separable Convolution · Hard Swish · Average Pooling · Dense Connections · Squeeze-and-Excitation Block
