HEMP: High-order Entropy Minimization for neural network comPression

Enzo Tartaglione; Stéphane Lathuilière; Attilio Fiandrotti; Marco Cagnazzo; Marco Grangetto

arXiv:2107.05298 · cs.LG · July 13, 2021

TL;DR

HEMP is a differentiable, scalable entropy-minimization method for neural network compression. It reduces storage size without sacrificing model performance, and it is compatible with different quantization schemes and with other compression techniques such as pruning.

Contribution

The paper presents a novel high-order entropy minimization technique that is differentiable, scalable, and agnostic to quantization schemes, improving neural network compression.

Findings

01. Outperforms similar methods in compression efficiency

02. Works synergistically with pruning and quantization approaches

03. Maintains model performance while reducing storage size

Abstract

We formulate the entropy of a quantized artificial neural network as a differentiable function that can be plugged in as a regularization term to the cost function minimized by gradient descent. Our formulation scales efficiently beyond the first order and is agnostic to the quantization scheme. The network can then be trained to minimize the entropy of the quantized parameters, so that they can be optimally compressed via entropy coding. We apply our entropy formulation to quantizing and compressing well-known network architectures over multiple datasets. Our approach compares favorably with similar methods, enjoying the benefits of a higher-order entropy estimate and showing flexibility towards non-uniform quantization (we use Lloyd-Max quantization), scalability towards any entropy order to be minimized, and efficiency in terms of compression. We show that HEMP is able to work in…
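
To make the quantity discussed in the abstract concrete, the sketch below computes the empirical entropy of a sequence of quantized parameter indices, at first order and at higher orders (as the conditional entropy of a symbol given the symbols that precede it). It is an illustration only and not the paper's formulation: this count-based estimate is not differentiable, whereas the abstract describes a differentiable function that can serve as a regularizer. The function name `empirical_entropy`, the assumption that parameters are read in a fixed scan order, and the toy sequence are all choices made for this example.

```python
import math
from collections import Counter


def empirical_entropy(symbols, order=1):
    """Count-based entropy estimate in bits per symbol.

    order=1 gives the usual entropy of the symbol frequencies.
    order=n (n > 1) gives the conditional entropy of a symbol given
    the n-1 symbols before it, estimated from n-gram counts.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if order < 1:
        raise ValueError("order must be a positive integer")
    if len(symbols) < order:
        return 0.0

    # Count every window of `order` consecutive symbols.
    grams = Counter(
        tuple(symbols[i:i + order]) for i in range(len(symbols) - order + 1)
    )

    # Count how often each context (the window minus its last symbol) occurs.
    contexts = Counter()
    for gram, count in grams.items():
        contexts[gram[:-1]] += count

    total = sum(grams.values())
    entropy = 0.0
    for gram, count in grams.items():
        joint = count / total                      # p(context, symbol)
        conditional = count / contexts[gram[:-1]]  # p(symbol | context)
        entropy -= joint * math.log2(conditional)
    return entropy


# Toy sequence of quantization indices (assumed, not real network weights).
indices = [0, 1, 0, 1, 0, 1, 0, 1]
first_order = empirical_entropy(indices, order=1)   # 1.0 bit per symbol
second_order = empirical_entropy(indices, order=2)  # 0.0 bits per symbol
```

On this alternating sequence the first-order estimate is 1 bit per symbol, because both indices are equally frequent, while the second-order estimate is 0, because each index is fully determined by the one before it. This gap is the kind of redundancy that only a higher-order estimate can expose.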

Taxonomy

Methods: Pruning