LegoNet: Memory Footprint Reduction Through Block Weight Clustering

Joseph Bingham; Noah Green; Saman Zonouz

arXiv:2603.06606·cs.LG·March 10, 2026

TL;DR

LegoNet is a weight clustering method that reduces a neural network's memory footprint without retraining. It groups weights into blocks and clusters those blocks, compressing ResNet-50 by over 64x with no accuracy loss.

Contribution

LegoNet introduces block-based weight clustering for neural network compression that requires no retraining or fine-tuning, enabling substantial memory savings.

Findings

01

Compressed ResNet-50 by over 64x with no accuracy loss, using only 32 4x4 blocks.

02

Found an arrangement of 16 4x4 blocks that yields 128x compression with minimal accuracy impact.

03

The compression process requires no retraining and no data.
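
The compression figures above can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes 32-bit weights, one index per 4x4 block, and a codebook stored at full precision; the function name, the parameter count of roughly 25.6 million for ResNet-50, and the bit accounting itself are illustrative assumptions, not the paper's stated method.

```python
def compression_ratio(n_weights, k, block=16, weight_bits=32, index_bits=None):
    """Estimate original_bits / compressed_bits for block clustering.

    Assumed accounting (not taken from the paper): every block of `block`
    weights is replaced by one index into a codebook of `k` blocks, and the
    codebook itself is stored at `weight_bits` per value.
    """
    if index_bits is None:
        # Smallest number of bits that can address k codebook entries.
        index_bits = max(1, (k - 1).bit_length())
    n_blocks = -(-n_weights // block)  # ceiling division
    original_bits = n_weights * weight_bits
    compressed_bits = n_blocks * index_bits + k * block * weight_bits
    return original_bits / compressed_bits


# Approximate ResNet-50 parameter count, used here only as a round number.
N_WEIGHTS = 25_600_000

ratio_16_blocks = compression_ratio(N_WEIGHTS, k=16)  # 4-bit indices
ratio_32_blocks = compression_ratio(N_WEIGHTS, k=32)  # 5-bit indices
```

Under these assumptions the codebook is negligible, so the ratio is close to 512 bits per block divided by the index width: about 512/4 = 128 for 16 blocks and about 512/5 ≈ 102 for 32 blocks. Both are consistent with the reported 128x and "over 64x" figures, although the paper may account for storage differently.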

Abstract

As the need for neural network-based applications to become more accurate and powerful grows, so too does their size and memory footprint. On embedded devices, whose cache and RAM are limited, this growth hinders the ability to leverage state-of-the-art neural network architectures. In this work, we propose LegoNet, a compression technique that constructs blocks of weights from the entire model, regardless of layer type, and clusters these induced blocks. By clustering blocks instead of individual values, we were able to compress ResNet-50 trained for CIFAR-10 and ImageNet with only 32 4x4 blocks, reducing the memory footprint by over a factor of 64x without removing any weights or changing the architecture, with no loss of accuracy, and without retraining or any data, and show how to find an arrangement of 16 4x4 blocks that gives a…
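
The block construction and clustering described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes all layers have already been flattened into one list of weights, zero-pads the tail to a whole block, and uses plain Lloyd's k-means as a stand-in clustering step, since the abstract does not say which clustering algorithm is used. All function names are hypothetical.

```python
import random

BLOCK = 16  # a 4x4 block, flattened; the block shape comes from the abstract


def make_blocks(weights, block=BLOCK):
    """Split a flat weight list into fixed-size blocks, zero-padding the tail."""
    padded = list(weights) + [0.0] * (-len(weights) % block)
    return [padded[i:i + block] for i in range(0, len(padded), block)]


def sq_dist(a, b):
    """Squared Euclidean distance between two blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(block, codebook):
    """Index of the codebook entry closest to `block`."""
    return min(range(len(codebook)), key=lambda j: sq_dist(block, codebook[j]))


def cluster_blocks(blocks, k, iters=10, seed=0):
    """Lloyd's k-means over blocks (an assumed choice); returns (codebook, indices)."""
    rng = random.Random(seed)
    codebook = [list(b) for b in rng.sample(blocks, k)]
    for _ in range(iters):
        indices = [nearest(b, codebook) for b in blocks]
        for j in range(k):
            members = [b for b, i in zip(blocks, indices) if i == j]
            if members:  # an empty cluster keeps its previous centroid
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    indices = [nearest(b, codebook) for b in blocks]
    return codebook, indices


def reconstruct(codebook, indices, n_weights):
    """Rebuild an approximate flat weight list from the codebook and indices."""
    flat = [w for i in indices for w in codebook[i]]
    return flat[:n_weights]  # drop the zero padding


# Toy usage on synthetic weights (not a real model).
_rng = random.Random(1)
weights = [_rng.gauss(0.0, 0.05) for _ in range(1000)]
blocks = make_blocks(weights)
codebook, indices = cluster_blocks(blocks, k=8)
approx = reconstruct(codebook, indices, len(weights))
```

After clustering, only the codebook and one small index per block need to be stored, which is where the memory saving comes from. No training data is touched at any point, matching the data-free claim in the abstract.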

Taxonomy

Topics: Advanced Neural Network Applications · Big Data and Digital Economy · Parallel Computing and Optimization Techniques