Deep Neural Network Compression with Single and Multiple Level Quantization

Yuhui Xu; Yongzhuang Wang; Aojun Zhou; Weiyao Lin; Hongkai Xiong

arXiv:1803.03289 · cs.LG · December 18, 2018 · 43 cites


TL;DR

This paper introduces two novel neural network quantization methods, SLQ and MLQ, that effectively utilize depth information to achieve high- and low-bit compression, validated on popular architectures.

Contribution

The paper presents the first combined approach considering both width and depth levels in network quantization, improving compression and accuracy.

Findings

1. SLQ improves high-bit quantization accuracy.
2. MLQ achieves extremely low-bit (ternary) network compression.
3. Both methods outperform existing quantization techniques.

Abstract

Network quantization is an effective way to compress deep neural networks for practical use. Existing network quantization methods cannot sufficiently exploit depth information to generate low-bit compressed networks. In this paper, we propose two novel network quantization approaches: single-level network quantization (SLQ) for high-bit quantization and multi-level network quantization (MLQ) for extremely low-bit (ternary) quantization. We are the first to consider network quantization at both the width and depth levels. At the width level, parameters are divided into two parts: one for quantization and the other for re-training to eliminate the quantization loss. SLQ leverages the distribution of the parameters to improve the width level. At the depth level, we introduce incremental layer compensation to quantize layers iteratively, which decreases the quantization loss in…
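The width-level split described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that "leveraging the distribution of the parameters" means clustering a layer's weights (a plain 1-D k-means here) and that the clusters with the lowest quantization loss are the ones quantized first. The names `kmeans_1d`, `width_level_step`, `k`, and `quantize_fraction` are hypothetical and chosen only for illustration.

```python
import numpy as np


def kmeans_1d(values, k, iters=20):
    """Plain 1-D k-means (Lloyd's algorithm) with quantile-based initial centroids."""
    centroids = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        # Assign every value to its nearest centroid.
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        for j in range(k):
            members = values[labels == j]
            if members.size:
                centroids[j] = members.mean()
    # Final assignment against the updated centroids.
    labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
    return centroids, labels


def width_level_step(weights, k=8, quantize_fraction=0.5):
    """Split one layer's weights into a quantized part and a re-training part.

    Assumption (not stated in the abstract): clusters with the smallest
    summed squared error to their centroid are quantized first.
    Returns the partially quantized weights and a boolean mask that is True
    where a weight was quantized (and would be frozen during re-training).
    """
    flat = weights.ravel()
    centroids, labels = kmeans_1d(flat, k)
    # Per-cluster quantization loss: squared distance of members to the centroid.
    loss = np.array(
        [np.sum((flat[labels == j] - centroids[j]) ** 2) for j in range(k)]
    )
    n_quant = max(1, int(round(k * quantize_fraction)))
    quant_clusters = np.argsort(loss)[:n_quant]
    frozen = np.isin(labels, quant_clusters)
    quantized = flat.copy()
    quantized[frozen] = centroids[labels[frozen]]
    return quantized.reshape(weights.shape), frozen.reshape(weights.shape)


# Example on random weights standing in for one layer.
rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 32))
quantized, frozen = width_level_step(weights, k=8, quantize_fraction=0.5)
```

In a full pipeline the weights where `frozen` is False would be re-trained to compensate for the quantization loss, and the step would be repeated until every cluster is quantized. That outer loop is omitted here because the abstract does not specify it.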


Code & Models

Repositories

yuhuixu1993/SLQ
Official


Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Data Compression Techniques · Cancer-related molecular mechanisms research

Methods: 1x1 Convolution · Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax