Self-Compressing Neural Networks

Szabolcs Cséfalvay; James Imber

arXiv:2301.13142 · cs.LG · June 18, 2025 · 1 citation

TL;DR

This paper introduces Self-Compression, a method to significantly reduce neural network size by removing redundant weights and decreasing bit precision, enabling efficient training and inference without specialized hardware.

Contribution

The paper presents a simple, general approach that simultaneously removes redundant weights and reduces the bit precision of the remaining ones by minimizing a generalized loss function that accounts for overall network size.

Findings

01 Matches floating-point accuracy with as few as 3% of the original bits remaining

02 Removes up to 82% of weights (18% remain) while maintaining accuracy

03 Reduces network size in a form that can be exploited without specialized hardware
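
The two headline figures can be related by simple arithmetic. The sketch below assumes that both percentages come from the same network and that the uncompressed baseline stores every weight in 32 bits; neither assumption is stated on this page, and the function name is illustrative.

```python
def mean_bits_per_remaining_weight(bits_fraction, weights_fraction, baseline_bits=32):
    """Average bit width of the surviving weights.

    bits_fraction:    remaining bits / original bits
    weights_fraction: remaining weights / original weights
    baseline_bits:    assumed bit width of each original weight (an assumption)
    """
    return baseline_bits * bits_fraction / weights_fraction


avg = mean_bits_per_remaining_weight(0.03, 0.18)
print(f"{avg:.2f} bits per remaining weight")  # about 5.33 under the 32-bit assumption
```

Under these assumptions the surviving weights would average roughly one sixth of the baseline bit width.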

Abstract

This work focuses on reducing neural network size, which is a major driver of neural network execution time, power consumption, bandwidth, and memory footprint. A key challenge is to reduce size in a manner that can be exploited readily for efficient training and inference without the need for specialized hardware. We propose Self-Compression: a simple, general method that simultaneously achieves two goals: (1) removing redundant weights, and (2) reducing the number of bits required to represent the remaining weights. This is achieved using a generalized loss function to minimize overall network size. In our experiments we demonstrate floating point accuracy with as few as 3% of the bits and 18% of the weights remaining in the network.
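
The abstract does not spell out the loss, so the following is only a minimal sketch of how a size-aware objective of this kind could look, assuming each output channel has its own integer bit depth and a power-of-two step size. The names `quantize`, `average_bits`, `combined_loss`, and the weighting factor `gamma` are hypothetical and are not taken from the paper or its repository. A trainable version would also need real-valued bit depths and a gradient estimator for the rounding step, which this forward-only sketch omits.

```python
def quantize(x, bits, exponent):
    """Round x onto a signed fixed-point grid (illustrative, forward pass only).

    bits:     integer bit depth; 0 means the weight is removed entirely
    exponent: the grid step is 2**exponent
    """
    if bits <= 0:
        return 0.0  # zero bits carry no information, so the weight vanishes
    lo = -(2 ** (bits - 1))      # smallest representable integer code
    hi = 2 ** (bits - 1) - 1     # largest representable integer code
    step = 2.0 ** exponent
    code = round(min(max(x / step, lo), hi))  # clip, then round to nearest code
    return step * code


def average_bits(channels):
    """channels: list of (num_weights, bits) pairs, one per output channel."""
    total_weights = sum(n for n, _ in channels)
    total_bits = sum(n * max(b, 0) for n, b in channels)
    return total_bits / total_weights


def combined_loss(task_loss, channels, gamma):
    """Task loss plus a penalty proportional to the average bits per weight."""
    return task_loss + gamma * average_bits(channels)


# One 8-bit channel and one channel already reduced to 0 bits (i.e. removed).
channels = [(10, 8), (10, 0)]
print(quantize(0.3, bits=4, exponent=-2))
print(combined_loss(task_loss=1.0, channels=channels, gamma=0.5))
```

With `gamma = 0` the objective reduces to the task loss alone; larger values favor channels dropping to zero bits, at which point their weights can be removed from the network.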

Code & Models

Repositories

benearnthof/self_compressing
pytorch

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Parallel Computing and Optimization Techniques