Scalable Compression of Deep Neural Networks

Xing Wang; Jie Liang

arXiv:1608.07365 · cs.CV · August 29, 2016

TL;DR

This paper introduces a scalable compression method for deep neural networks. Each device selects the bit rate that fits its storage constraints, and a device upgrading to a higher-rate network reuses its existing low-rate network and downloads only incremental data.

Contribution

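The paper proposes a scalable representation of network parameters built in three steps: hierarchical quantization of pre-trained weights to enforce weight sharing, adaptive allocation of bits to each layer under a total bit budget, and retraining to fine-tune the quantized centroids.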
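To make the bit-allocation step concrete, here is a minimal sketch of a generic greedy marginal-return allocator. It is not the authors' procedure, which this page does not describe. The function name `allocate_bits`, the per-layer distortion tables, and the layer sizes are all hypothetical inputs chosen for illustration.

```python
def allocate_bits(distortion, sizes, budget):
    """Greedy per-layer bit allocation under a total bit budget.

    distortion[l][b]: assumed error of layer l when quantized at b bits per weight.
    sizes[l]: number of weights in layer l (one extra bit costs sizes[l] bits).
    budget: total number of bits available across all layers.
    """
    bits = [0] * len(sizes)
    used = 0
    while True:
        best, best_gain = None, 0.0
        for layer, table in enumerate(distortion):
            b = bits[layer]
            # Skip layers with no finer level left or that no longer fit.
            if b + 1 >= len(table) or used + sizes[layer] > budget:
                continue
            # Error reduction per bit spent on this layer.
            gain = (table[b] - table[b + 1]) / sizes[layer]
            if gain > best_gain:
                best, best_gain = layer, gain
        if best is None:
            return bits
        bits[best] += 1
        used += sizes[best]


# Hypothetical two-layer example: a large layer and a small one.
tables = [[8.0, 4.0, 2.0, 1.0], [8.0, 2.0, 1.0, 0.5]]
allocation = allocate_bits(tables, sizes=[100, 10], budget=230)
```

The small layer is refined first because each of its bits is cheap, which is the usual behaviour of a greedy allocator of this kind. A real system would measure the distortion tables on validation data.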

Findings

01 Achieves scalable compression with graceful performance degradation

02 Enables incremental updates by reusing low-rate networks

03 Maintains competitive accuracy with reduced storage requirements

Abstract

Deep neural networks generally involve some layers with millions of parameters, making them difficult to deploy and update on devices with limited resources such as mobile phones and other smart embedded systems. In this paper, we propose a scalable representation of the network parameters, so that different applications can select the most suitable bit rate of the network based on their own storage constraints. Moreover, when a device needs to upgrade to a high-rate network, the existing low-rate network can be reused, and only some incremental data need to be downloaded. We first hierarchically quantize the weights of a pre-trained deep neural network to enforce weight sharing. Next, we adaptively select the bits assigned to each layer given the total bit budget. After that, we retrain the network to fine-tune the quantized centroids. Experimental results show that our…
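One way to picture why a hierarchical quantizer permits incremental upgrades is the sketch below. It is an illustrative assumption, not the paper's algorithm: it uses a simple binary split of each cluster at its mean, so that every level adds one bit per weight and the coarser code is always a prefix of the finer one. The names `hierarchical_codes` and `reconstruct` are hypothetical.

```python
import numpy as np


def hierarchical_codes(weights, num_levels):
    """Return one integer code array per level; level k uses k + 1 bits per weight."""
    w = np.asarray(weights, dtype=np.float64).ravel()
    codes = np.zeros(w.shape, dtype=np.int64)
    per_level = []
    for _ in range(num_levels):
        refined = codes * 2  # shift existing bits left to make room for one more
        for c in np.unique(codes):
            mask = codes == c
            # Split the cluster at its mean; the new bit records the side.
            refined[mask] += (w[mask] > w[mask].mean()).astype(np.int64)
        codes = refined
        per_level.append(codes.copy())
    return per_level


def reconstruct(weights, codes):
    """Replace each weight by the mean (shared centroid) of its cluster."""
    w = np.asarray(weights, dtype=np.float64).ravel()
    out = np.empty_like(w)
    for c in np.unique(codes):
        mask = codes == c
        out[mask] = w[mask].mean()
    return out


# Synthetic weights stand in for one layer of a pre-trained network.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
levels = hierarchical_codes(w, num_levels=3)
errors = [float(np.mean((w - reconstruct(w, c)) ** 2)) for c in levels]
```

Because `levels[k + 1] // 2` equals `levels[k]`, a device holding the level-k codes would only need the one extra bit per weight (plus the refined centroids) to move up a level. In this sketch the centroids are recomputed from the original weights for brevity; a deployed model would store them, and the abstract indicates they are further fine-tuned by retraining.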

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods