Soft Weight-Sharing for Neural Network Compression

Karen Ullrich; Edward Meeds; Max Welling

arXiv:1702.04008·stat.ML·May 10, 2017·81 cites

TL;DR

This paper introduces a simple soft weight-sharing method for neural network compression that combines quantization and pruning in a single training process, inspired by the MDL principle.

Contribution

It presents a novel, unified approach to neural network compression using soft weight-sharing, simplifying the process compared to existing multi-step pipelines.

Findings

1. Achieves competitive compression rates with fewer training steps

2. Simultaneously performs quantization and pruning

3. Provides insights into the relation between compression and MDL

Abstract

The success of deep learning in numerous application domains created the desire to run and train them on mobile devices. This, however, conflicts with their computationally, memory and energy intense nature, leading to a growing interest in compression. Recent work by Han et al. (2015a) proposes a pipeline that involves retraining, pruning and quantization of neural network weights, obtaining state-of-the-art compression rates. In this paper, we show that competitive compression rates can be achieved by using a version of soft weight-sharing (Nowlan & Hinton, 1992). Our method achieves both quantization and pruning in one simple (re-)training procedure. This point of view also exposes the relation between compression and the minimum description length (MDL) principle.
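To make the idea in the abstract concrete, here is a minimal sketch of soft weight-sharing, assuming the Gaussian-mixture formulation of Nowlan & Hinton (1992): the weights are penalized by their negative log-likelihood under a mixture of Gaussians, and after (re-)training each weight is snapped to the mean of its most responsible component. Placing one component at zero makes that step act as pruning. The function names, the mixture parameters, and the toy weights are all illustrative assumptions, not taken from the paper; in a real setup the mixture parameters would presumably be learned together with the weights rather than fixed as they are here.

```python
import numpy as np

def component_log_probs(w, means, stds, pis):
    """Per-weight, per-component log(pi_k * N(w | mu_k, sigma_k^2))."""
    w = np.asarray(w, dtype=float)[:, None]  # shape (n_weights, 1)
    return (
        np.log(pis)
        - 0.5 * np.log(2.0 * np.pi * stds ** 2)
        - 0.5 * ((w - means) / stds) ** 2
    )  # shape (n_weights, n_components)

def soft_weight_sharing_penalty(w, means, stds, pis):
    """Negative log-likelihood of the weights under the mixture prior.

    In training this term would be added to the task loss
    (assumption: with some trade-off factor chosen by the user).
    """
    log_comp = component_log_probs(w, means, stds, pis)
    # log-sum-exp over components, written out for numerical stability
    m = log_comp.max(axis=1, keepdims=True)
    log_p = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return -log_p.sum()

def quantize(w, means, stds, pis):
    """Replace each weight by the mean of its most responsible component."""
    log_comp = component_log_probs(w, means, stds, pis)
    return means[np.argmax(log_comp, axis=1)]

# Hypothetical mixture: component 0 sits at zero and carries most of the mass,
# so weights assigned to it are effectively pruned.
means = np.array([0.0, -0.5, 0.5])
stds = np.array([0.05, 0.1, 0.1])
pis = np.array([0.9, 0.05, 0.05])

weights = np.array([0.01, -0.48, 0.52, 0.03, -0.02])  # toy weights

penalty = soft_weight_sharing_penalty(weights, means, stds, pis)
quantized = quantize(weights, means, stds, pis)
pruned_fraction = float(np.mean(quantized == 0.0))
```

In this toy example three of the five weights fall into the zero component and the other two collapse onto the shared values -0.5 and 0.5, so a single assignment step yields both pruning and quantization. The penalty can also be read as the cost of encoding the weights under the mixture prior, which is where the connection to MDL comes in.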

Taxonomy

Topics: Advanced Neural Network Applications · Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis

Methods: Pruning