Linearity-based neural network compression

Silas Dobler; Florian Lemmerich

arXiv:2506.21146·cs.LG·June 27, 2025



TL;DR

This paper introduces a novel neural network compression method based on the linearity of neurons with ReLU-like activations, achieving significant size reduction without loss of performance.

Contribution

It proposes a new linearity-based compression technique that complements existing importance-based methods, enabling more efficient neural network models.

Findings

01. Achieves lossless compression down to 25% of original size in most models.

02. Combining with importance-based pruning shows minimal interference.

03. Lays foundation for a new class of neural network compression methods.

Abstract

In neural network compression, most current methods reduce unnecessary parameters by measuring importance and redundancy. To augment already highly optimized existing solutions, we propose linearity-based compression as a novel way to reduce weights in a neural network. It is based on the intuition that with ReLU-like activation functions, neurons that are almost always activated behave linearly, allowing for merging of subsequent layers. We introduce the theory underlying this compression and evaluate our approach experimentally. Our novel method achieves lossless compression down to 1/4 of the original model size in the majority of tested models. Applying our method to models already pruned with importance-based methods shows very little interference between the different types of compression, demonstrating that the techniques can be combined successfully. Overall, our work lays the…
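The merging intuition in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it covers only the idealized case in which every hidden neuron is active on the whole input domain, so ReLU acts as the identity and two layers collapse into one affine map with weights W2·W1 and bias W2·b1 + b2. The toy weights, the input range [0, 1]^2, and the helper names `affine` and `merge_affine` are assumptions made for illustration.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def affine(W, b, x):
    """Return W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def merge_affine(W1, b1, W2, b2):
    """Collapse x -> W2 @ (W1 @ x + b1) + b2 into a single affine map (W, b)."""
    hidden, n_in = len(W1), len(W1[0])
    W = [[sum(W2[i][k] * W1[k][j] for k in range(hidden)) for j in range(n_in)]
         for i in range(len(W2))]
    b = [sum(W2[i][k] * b1[k] for k in range(hidden)) + b2[i]
         for i in range(len(W2))]
    return W, b

# Hypothetical toy network (assumption): inputs lie in [0, 1]^2, and the first
# layer has non-negative weights and positive biases, so every hidden
# pre-activation is positive and ReLU never clips.
W1 = [[0.5, 0.2], [0.1, 0.9], [0.3, 0.3]]
b1 = [0.1, 0.2, 0.3]
W2 = [[1.0, -2.0, 0.5]]
b2 = [0.05]

W, b = merge_affine(W1, b1, W2, b2)

# Compare the original two-layer network with the merged single layer.
random.seed(0)
max_err = 0.0
for _ in range(1000):
    x = [random.random(), random.random()]
    original = affine(W2, b2, relu(affine(W1, b1, x)))
    merged = affine(W, b, x)
    max_err = max(max_err, abs(original[0] - merged[0]))

params_before = sum(len(r) for r in W1) + len(b1) + sum(len(r) for r in W2) + len(b2)
params_after = sum(len(r) for r in W) + len(b)
print(f"max abs error: {max_err:.2e}, parameters: {params_before} -> {params_after}")
```

In this toy case the merged layer matches the original outputs up to floating-point rounding while storing 3 parameters instead of 13. Neurons that are only almost always active would introduce an approximation error on the inputs where they switch off, which this sketch does not model.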

Taxonomy

Topics: Advanced Data Compression Techniques · Neural Networks and Applications