Joslim: Joint Widths and Weights Optimization for Slimmable Neural Networks

Ting-Wu Chin; Ari S. Morcos; Diana Marculescu

arXiv:2007.11752 · cs.LG · July 1, 2021

TL;DR

This paper introduces Joslim, a framework that jointly optimizes the layer-wise widths and the shared weights of slimmable neural networks, improving accuracy at a given computational cost across several networks and datasets.

Contribution

The paper proposes a general framework for jointly optimizing the width configurations and weights of slimmable networks, which subsumes conventional and NAS-based slimmable methods as special cases. Within that framework it introduces Joslim, an algorithm that optimizes widths and weights simultaneously rather than independently.

Findings

01

Up to 1.7% top-1 accuracy improvement on ImageNet for MobileNetV2 when the cost metric is FLOPs.

02

Up to 8% top-1 accuracy improvement on ImageNet for MobileNetV2 when the cost metric is memory footprint.

03

Outperforms existing slimmable network optimization methods.

Abstract

Slimmable neural networks provide a flexible trade-off front between prediction error and computational requirement (such as the number of floating-point operations or FLOPs) with the same storage requirement as a single model. They are useful for reducing maintenance overhead for deploying models to devices with different memory constraints and are useful for optimizing the efficiency of a system with many CNNs. However, existing slimmable network approaches either do not optimize layer-wise widths or optimize the shared-weights and layer-wise widths independently, thereby leaving significant room for improvement by joint width and weight optimization. In this work, we propose a general framework to enable joint optimization for both width configurations and weights of slimmable networks. Our framework subsumes conventional and NAS-based slimmable methods as special cases and provides…
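To make the idea of a slimmable network concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy two-layer MLP in which every sub-network uses the first `width` hidden units of the same shared weight matrices. The class `SlimmableMLP` and its methods are hypothetical names chosen for illustration, and multiply-accumulate operations (MACs) stand in for FLOPs. Joslim itself optimizes a separate width for each layer of a CNN, which this single-width toy does not attempt.

```python
import random


def make_weights(rows, cols, seed):
    """Create a rows x cols weight matrix with reproducible random values."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class SlimmableMLP:
    """Two-layer MLP whose hidden width can be chosen at inference time.

    Hypothetical illustration: every sub-network uses the first `width`
    hidden units of the same shared weight matrices.
    """

    def __init__(self, in_dim, max_hidden, out_dim, seed=0):
        self.in_dim = in_dim
        self.max_hidden = max_hidden
        self.out_dim = out_dim
        self.w1 = make_weights(max_hidden, in_dim, seed)       # hidden x in
        self.w2 = make_weights(out_dim, max_hidden, seed + 1)  # out x hidden

    def forward(self, x, width):
        if not 1 <= width <= self.max_hidden:
            raise ValueError("width must be in [1, max_hidden]")
        # Layer 1: only the first `width` rows of w1 are used, followed by ReLU.
        hidden = [max(0.0, sum(w * xi for w, xi in zip(self.w1[j], x)))
                  for j in range(width)]
        # Layer 2: only the first `width` columns of w2 are used.
        return [sum(self.w2[k][j] * hidden[j] for j in range(width))
                for k in range(self.out_dim)]

    def macs(self, width):
        """Multiply-accumulate count for one forward pass at this width."""
        return width * self.in_dim + self.out_dim * width

    def num_parameters(self):
        """Storage is set by the widest network, whatever width is run."""
        return self.max_hidden * self.in_dim + self.out_dim * self.max_hidden


net = SlimmableMLP(in_dim=4, max_hidden=8, out_dim=2)
x = [0.5, -1.0, 0.25, 2.0]
for width in (2, 4, 8):
    # Compute cost grows with width; the stored parameter count does not.
    print(width, net.macs(width), net.num_parameters())
```

In this sketch the compute cost scales with the chosen width while the stored parameter count stays fixed, which mirrors the trade-off the abstract describes. Which widths to expose, and how to train the shared weights so that every width performs well, is the joint optimization problem the paper addresses.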

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification

Methods: Pointwise Convolution · Depthwise Convolution · ReLU6 · Sigmoid Activation · Depthwise Separable Convolution · Hard Swish · Average Pooling · Dense Connections · Squeeze-and-Excitation Block