Universal and Succinct Source Coding of Deep Neural Networks
Sourya Basu, Lav R. Varshney

TL;DR
This paper introduces a universal lossless compression method for deep feedforward neural networks with discrete-valued weights. It exploits the fact that relabeling the nodes within a layer does not change the network's function, and it supports inference without full decompression.
Contribution
The paper presents a universal compression algorithm for neural network weights that removes the redundancy due to node-label permutation invariance, achieves rates close to the entropy bound, and allows inference directly on the compressed representation.
Findings
Achieves compression rates close to the entropy bound.
Enables inference directly on compressed models.
Demonstrates effectiveness on standard datasets.
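As a rough illustration of what an entropy bound means for discrete weights, the sketch below computes the empirical entropy of a small weight list and compares it with a fixed-length code. The weight values and the helper name `empirical_entropy_bits` are hypothetical, and the calculation treats weights as i.i.d. symbols; it does not include the additional savings from permutation invariance described in the paper.

```python
import math
from collections import Counter

def empirical_entropy_bits(symbols):
    """Empirical entropy in bits per symbol, treating symbols as i.i.d."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Hypothetical ternary weights, made up for illustration only.
weights = [0, 0, 0, 0, 1, 1, -1, 0]

entropy = empirical_entropy_bits(weights)
# A fixed-length code needs ceil(log2(alphabet size)) bits per weight.
fixed_length = math.ceil(math.log2(len(set(weights))))

print(f"entropy: {entropy:.3f} bits/weight, fixed-length: {fixed_length} bits/weight")
```

An entropy coder such as arithmetic coding can approach the first figure on long sequences under this i.i.d. model, which is why skewed weight distributions compress well below the fixed-length cost.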
Abstract
Deep neural networks have shown incredible performance for inference tasks in a variety of domains. Unfortunately, most current deep networks are enormous cloud-based structures that require significant storage space, which limits scaling of deep learning as a service (DLaaS) and use for on-device intelligence. This paper is concerned with finding universal lossless compressed representations of deep feedforward networks with synaptic weights drawn from discrete sets, and directly performing inference without full decompression. The basic insight that allows less rate than naive approaches is recognizing that the bipartite graph layers of feedforward networks have a kind of permutation invariance to the labeling of nodes, in terms of inferential operation. We provide efficient algorithms to dissipate this irrelevant uncertainty and then use arithmetic coding to nearly achieve the…
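The permutation invariance mentioned in the abstract can be checked on a toy network. The sketch below is not the authors' algorithm: it only shows that relabeling hidden nodes leaves the output unchanged, and that sorting hidden nodes into a canonical order (an assumption chosen here for illustration) maps every relabeling to the same representation. All names (`forward`, `permute_hidden`, `canonical_form`) and the network sizes are hypothetical.

```python
import math
import random

def relu(x):
    return x if x > 0 else 0

def forward(w1, w2, x):
    """Two-layer pass. w1[j] = incoming weights of hidden node j;
    w2[k][j] = weight from hidden node j to output node k."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def permute_hidden(w1, w2, perm):
    """Relabel hidden nodes so that new node i is old node perm[i]."""
    new_w1 = [w1[j] for j in perm]
    new_w2 = [[row[j] for j in perm] for row in w2]
    return new_w1, new_w2

def canonical_form(w1, w2):
    """Sort hidden nodes by their (incoming, outgoing) weight vectors."""
    def key(j):
        return (tuple(w1[j]), tuple(row[j] for row in w2))
    perm = sorted(range(len(w1)), key=key)
    return permute_hidden(w1, w2, perm)

rng = random.Random(0)
levels = [-1, 0, 1]  # discrete weight set, assumed for illustration
n_in, n_hidden, n_out = 4, 6, 3
w1 = [[rng.choice(levels) for _ in range(n_in)] for _ in range(n_hidden)]
w2 = [[rng.choice(levels) for _ in range(n_hidden)] for _ in range(n_out)]
x = [rng.randint(-3, 3) for _ in range(n_in)]

perm = list(range(n_hidden))
rng.shuffle(perm)
pw1, pw2 = permute_hidden(w1, w2, perm)

# Integer arithmetic, so the comparison is exact.
same_output = forward(w1, w2, x) == forward(pw1, pw2, x)
same_canonical = canonical_form(w1, w2) == canonical_form(pw1, pw2)
# At most log2(m!) bits per hidden layer describe the labeling alone.
max_saving_bits = math.log2(math.factorial(n_hidden))

print(same_output, same_canonical, round(max_saving_bits, 2))
```

Because a hidden layer of m nodes has at most m! equivalent labelings, a code that does not distinguish between them can save up to log2(m!) bits for that layer; the saving is smaller when some hidden nodes have identical weights.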
Taxonomy
Topics: Advanced Memory and Neural Computing · CCD and CMOS Imaging Sensors · Neural Networks and Applications
