Model compression as constrained optimization, with application to neural nets. Part II: quantization

Miguel Á. Carreira-Perpiñán; Yerlan Idelbayev

arXiv:1707.04319 · cs.LG · July 17, 2017 · 23 cites

TL;DR

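This paper introduces an iterative "learning-compression" algorithm for quantizing neural net weights. Unlike direct quantization of a trained net or rounding inside backpropagation, it converges to a local optimum of the loss over quantized nets, which allows higher compression rates with little loss degradation.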

Contribution

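It applies the framework of model compression as constrained optimization \citep{Carreir17a} to weight quantization. The result is an iterative learning-compression algorithm that alternates a step learning continuous weights with a step quantizing them, and that supports both learned codebooks and fixed ones (binarization, ternarization).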

Findings

01 Achieves higher compression rates than previous methods.

02 Incurs negligible loss degradation even at 1-bit quantization.

03 Supports both adaptive (learned) and fixed codebook schemes.
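
The difference between the two codebook schemes can be illustrated on a plain list of weights. The sketch below is only an illustration under stated assumptions, not the authors' code: the function names `quantize_fixed`, `quantize_adaptive` and `nearest_index` are hypothetical, the adaptive codebook is assumed to be fitted with one-dimensional k-means (Lloyd's algorithm), and the quantile initialization is an arbitrary choice.

```python
def nearest_index(w, codebook):
    """Index of the codebook entry closest to the weight w."""
    return min(range(len(codebook)), key=lambda j: (w - codebook[j]) ** 2)


def quantize_fixed(weights, codebook):
    """Fixed codebook: map each weight to its nearest entry.

    With codebook [-1.0, 1.0] this is binarization; with
    [-1.0, 0.0, 1.0] it is ternarization.
    """
    return [codebook[nearest_index(w, codebook)] for w in weights]


def quantize_adaptive(weights, k, iters=50):
    """Adaptive codebook: learn k entries with 1-D k-means (an assumption)."""
    ordered = sorted(weights)
    n = len(ordered)
    # Initialise the entries at evenly spaced quantiles of the weights.
    codebook = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        # Assignment step: nearest entry for every weight.
        assign = [nearest_index(w, codebook) for w in weights]
        # Update step: each entry becomes the mean of its assigned weights.
        updated = []
        for j in range(k):
            members = [w for w, a in zip(weights, assign) if a == j]
            updated.append(sum(members) / len(members) if members else codebook[j])
        if updated == codebook:
            break
        codebook = updated
    assign = [nearest_index(w, codebook) for w in weights]
    return codebook, [codebook[a] for a in assign]


weights = [-0.9, -1.1, 0.1, 0.05, 1.2, 0.8]
print(quantize_fixed(weights, [-1.0, 1.0]))  # [-1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
codebook, quantized = quantize_adaptive(weights, k=3)
print(codebook, quantized)
```

With a K-entry codebook each weight can be stored as an index of about log2(K) bits plus the shared codebook, which is where the 1-bit figure for K = 2 comes from.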

Abstract

We consider the problem of deep neural net compression by quantization: given a large, reference net, we want to quantize its real-valued weights using a codebook with $K$ entries so that the training loss of the quantized net is minimal. The codebook can be optimally learned jointly with the net, or fixed, as for binarization or ternarization approaches. Previous work has quantized the weights of the reference net, or incorporated rounding operations in the backpropagation algorithm, but this has no guarantee of converging to a loss-optimal, quantized net. We describe a new approach based on the recently proposed framework of model compression as constrained optimization \citep{Carreir17a}. This results in a simple iterative "learning-compression" algorithm, which alternates a step that learns a net of continuous weights with a step that quantizes (or binarizes/ternarizes) the weights,…
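
The alternation described in the abstract can be sketched on a toy problem. The following is a minimal illustration under stated assumptions, not the authors' implementation and not the API of the repository listed on this page. The truncated abstract does not say how the two steps are coupled, so the sketch assumes a quadratic penalty `mu / 2 * ||w - q||^2` whose weight `mu` grows over iterations. The linear model, the data, the fixed codebook, and the names `loss_grad`, `c_step` and `lc_quantize` are all hypothetical.

```python
def loss_grad(w, xs, ys):
    """Gradient of the mean squared error of a linear model y = w . x."""
    g = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for i, xi in enumerate(x):
            g[i] += err * xi / len(xs)
    return g


def c_step(w, codebook):
    """Compression step: nearest entry of a fixed codebook for each weight."""
    return [min(codebook, key=lambda c: abs(wi - c)) for wi in w]


def lc_quantize(w, xs, ys, codebook, mu0=1e-2, growth=1.5,
                outer=30, inner=100, lr=0.5):
    """Alternate learning (L) and compression (C) steps.

    Assumed coupling: a quadratic penalty with growing weight mu.
    lr should stay below 1 / (largest curvature of the loss).
    """
    w = list(w)
    q = c_step(w, codebook)  # start from direct quantization of the reference
    mu = mu0
    for _ in range(outer):
        # L step: gradient descent on loss(w) + mu / 2 * ||w - q||^2.
        # The step shrinks as mu grows so the penalty term stays stable.
        step = lr / (1.0 + lr * mu)
        for _ in range(inner):
            g = loss_grad(w, xs, ys)
            w = [wi - step * (gi + mu * (wi - qi))
                 for wi, gi, qi in zip(w, g, q)]
        # C step: re-quantize the current continuous weights.
        q = c_step(w, codebook)
        mu *= growth  # tighten the coupling between w and q
    return q


# Toy data generated by the reference weights [0.8, -1.3].
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ys = [0.8, -1.3, -0.5]
reference = [0.8, -1.3]
binary = lc_quantize(reference, xs, ys, codebook=[-1.0, 1.0])
print(binary)  # [1.0, -1.0]
```

On this tiny problem the result coincides with directly rounding the reference weights; the sketch is meant to show the structure of the loop, not a case where the two approaches differ. For a learned codebook, the C step would fit the codebook entries to the current weights instead of using a fixed list.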

Code & Models

Repositories

UCMerced-ML/LC-model-compression (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications · Advanced Data Compression Techniques · Generative Adversarial Networks and Image Synthesis