LegoNet: Memory Footprint Reduction Through Block Weight Clustering
Joseph Bingham, Noah Green, Saman Zonouz

TL;DR
LegoNet is a novel weight clustering method that significantly reduces neural network memory footprint without retraining, by grouping weights into blocks and clustering these blocks, achieving over 64x compression with no accuracy loss.
Contribution
LegoNet introduces block-based weight clustering for neural network compression that requires no retraining or fine-tuning, enabling substantial memory savings.
Findings
Achieved 64x compression of ResNet-50 with no accuracy loss.
Found an arrangement of 16 blocks yielding 128x compression with minimal accuracy impact.
The compression process requires no retraining and no data.
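The ratios quoted above can be sanity-checked with a back-of-the-envelope calculation. The sketch below is not the authors' accounting. It assumes 32-bit floating-point weights, 4x4 blocks of 16 weights, one codebook index stored per block, and a codebook whose own size is negligible next to the model. The function name `compression_ratio` and its parameters are hypothetical.

```python
def compression_ratio(num_blocks, block_size=16, weight_bits=32, byte_aligned=False):
    """Ratio of original bits to index bits for one block (codebook size ignored)."""
    # Bits needed to address one of `num_blocks` codebook entries.
    index_bits = max(1, (num_blocks - 1).bit_length())
    if byte_aligned:
        # Assumption: round each index up to a whole number of bytes.
        index_bits = 8 * ((index_bits + 7) // 8)
    return (block_size * weight_bits) / index_bits

ratio_16 = compression_ratio(16)                            # 4-bit indices
ratio_32_packed = compression_ratio(32)                     # 5-bit indices
ratio_32_bytes = compression_ratio(32, byte_aligned=True)   # 8-bit indices
```

Under these assumptions, 16 blocks need a 4-bit index for every 512 bits of weights, which gives 128x. 32 blocks give 102.4x with tightly packed 5-bit indices, or exactly 64x if each index occupies a full byte. Both values for 32 blocks are consistent with the "over 64x" figure, but the page does not say which index encoding was used.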
Abstract
As neural network-based applications grow more accurate and powerful, so do their size and memory footprint. On embedded devices, whose cache and RAM are limited, this growth hinders the use of state-of-the-art neural network architectures. In this work, we propose LegoNet, a compression technique that constructs blocks of weights across the entire model, regardless of layer type, and clusters these induced blocks. By clustering blocks instead of individual values, we were able to compress ResNet-50 trained for CIFAR-10 and ImageNet with only 32 4x4 blocks, reducing the memory footprint by a factor of over 64x without removing any weights or changing the architecture, with no loss of accuracy, and without retraining or any data. We also show how to find an arrangement of 16 4x4 blocks that gives a…
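The block-clustering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that weights from all layers are flattened into one stream, cut into 16-value (4x4) blocks with zero padding at the tail, and clustered with plain Lloyd's k-means, which stands in here for whatever clustering procedure the paper actually uses. All function names are hypothetical, and the toy "model" is random data.

```python
import random

BLOCK = 16  # one 4x4 block, flattened to 16 values

def to_blocks(layers):
    """Flatten every layer into one stream and cut it into 16-value blocks."""
    flat = [w for layer in layers for w in layer]
    pad = (-len(flat)) % BLOCK      # zero-pad the tail (an assumption)
    flat.extend([0.0] * pad)
    return [flat[i:i + BLOCK] for i in range(0, len(flat), BLOCK)], pad

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(block, codebook):
    """Index of the codebook entry closest to `block`."""
    return min(range(len(codebook)), key=lambda k: sq_dist(block, codebook[k]))

def kmeans(blocks, k, iters=10, seed=0):
    """Plain Lloyd's k-means over blocks (a stand-in clustering choice)."""
    rng = random.Random(seed)
    codebook = [list(b) for b in rng.sample(blocks, k)]
    for _ in range(iters):
        assign = [nearest(b, codebook) for b in blocks]
        for c in range(k):
            members = [b for b, a in zip(blocks, assign) if a == c]
            if members:  # keep the old centroid if a cluster goes empty
                codebook[c] = [sum(col) / len(members) for col in zip(*members)]
    return codebook, [nearest(b, codebook) for b in blocks]

def decompress(codebook, indices, pad):
    """Rebuild the weight stream by looking up each block index."""
    flat = [w for i in indices for w in codebook[i]]
    return flat[:len(flat) - pad] if pad else flat

# Toy "model": two layers of random weights, 350 values in total.
rng = random.Random(1)
layers = [[rng.gauss(0, 0.05) for _ in range(200)],
          [rng.gauss(0, 0.05) for _ in range(150)]]

blocks, pad = to_blocks(layers)
codebook, indices = kmeans(blocks, k=4)
restored = decompress(codebook, indices, pad)
```

After compression, only `codebook` (k blocks of 16 values) and `indices` (one small integer per block) need to be stored. No training data or gradient step appears anywhere in the sketch, which matches the abstract's claim that the method works without retraining or data.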
Taxonomy
Topics: Advanced Neural Network Applications · Big Data and Digital Economy · Parallel Computing and Optimization Techniques
