# Entropy-Constrained Training of Deep Neural Networks

**Authors:** Simon Wiedemann, Arturo Marban, Klaus-Robert Müller, Wojciech Samek

arXiv: 1812.07520 · 2018-12-20

## TL;DR

This paper introduces an entropy-based framework for neural network compression that unifies techniques such as pruning and weight-cardinality reduction under a single optimization objective, and reports state-of-the-art results with up to 71x compression on a VGG-like architecture.

## Contribution

The paper formalizes neural network compression as an entropy-constrained optimization problem and derives a continuous relaxation of that objective, so it can be minimized with gradient-based optimization.
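As a rough illustration of what an entropy-constrained objective can look like, the following is a generic Lagrangian sketch, not the paper's exact formulation (which is not reproduced in this summary). The symbols $W$ (discrete network weights), $\mathcal{L}$ (task loss), $\lambda$ (trade-off coefficient), and $P_k$ (empirical fraction of weights that take the $k$-th discrete value) are assumptions introduced here for illustration.

$$
\min_{W} \; \mathcal{L}(W) + \lambda \, H(W), \qquad H(W) = -\sum_{k} P_k \log_2 P_k
$$

Under this reading, $H(W)$ is measured in bits per weight, so for a network with $N$ weights the product $N \cdot H(W)$ gives an estimate of its coded size in bits. Larger values of $\lambda$ trade task accuracy for a smaller description.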

## Key findings

- Achieved up to 71x compression on a VGG-like architecture
- Unified multiple compression techniques under a single entropy minimization framework
- Demonstrated state-of-the-art compression results across architectures and datasets

## Abstract

We propose a general framework for neural network compression that is motivated by the Minimum Description Length (MDL) principle. For that we first derive an expression for the entropy of a neural network, which measures its complexity explicitly in terms of its bit-size. Then, we formalize the problem of neural network compression as an entropy-constrained optimization objective. This objective generalizes many of the compression techniques proposed in the literature, in that pruning or reducing the cardinality of the weight elements of the network can be seen as special cases of entropy-minimization techniques. Furthermore, we derive a continuous relaxation of the objective, which allows us to minimize it using gradient-based optimization techniques. Finally, we show that we can reach state-of-the-art compression results on different network architectures and data sets, e.g. achieving x71 compression gains on a VGG-like architecture.
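To make the abstract's claim more concrete, here is a minimal sketch of why pruning and cardinality reduction can both be read as entropy reduction. It assumes the plain first-order Shannon entropy of quantized weight indices as the complexity measure, which may differ from the expression derived in the paper. The helper names (`empirical_entropy_bits`, `quantize`, `prune_indices`, `coarsen_indices`) and all numeric settings are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def empirical_entropy_bits(symbols):
    """First-order Shannon entropy, in bits per symbol, of a discrete sequence."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def quantize(weights, step):
    """Map each weight to the integer index of its nearest multiple of `step`."""
    return [round(w / step) for w in weights]


def prune_indices(indices, min_abs_index):
    """Pruning: set every index with small magnitude to zero."""
    return [0 if abs(i) < min_abs_index else i for i in indices]


def coarsen_indices(indices, factor):
    """Cardinality reduction: merge neighbouring quantization levels."""
    return [round(i / factor) for i in indices]


# Synthetic stand-in for a layer's weights (an assumption, not data from the paper).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10_000)]

dense = quantize(weights, step=0.01)
pruned = prune_indices(dense, min_abs_index=5)
coarse = coarsen_indices(dense, factor=5)

h_dense = empirical_entropy_bits(dense)
h_pruned = empirical_entropy_bits(pruned)
h_coarse = empirical_entropy_bits(coarse)

for name, h in [("dense", h_dense), ("pruned", h_pruned), ("coarse", h_coarse)]:
    # Entropy times the number of weights estimates the coded size in bits.
    print(f"{name:>6}: {h:.3f} bits/weight, about {h * len(weights):.0f} bits total")
```

Both transformations map several distinct weight values onto one, so the empirical entropy of the result cannot exceed that of the original quantized weights. This is the sense in which the two techniques appear as special cases of entropy minimization in this simplified setting.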

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.07520/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1812.07520/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/1812.07520/full.md

---
Source: https://tomesphere.com/paper/1812.07520