Range Asymmetric Numeral Systems-Based Lightweight Intermediate Feature Compression for Split Computing of Deep Neural Networks
Mingyu Sung, Suhwan Im, Vikas Palakonda, Jae-Mo Kang

TL;DR
This paper introduces a lightweight, efficient feature compression method using rANS encoding combined with quantization and sparsity, significantly reducing communication overhead in split neural network computing while maintaining accuracy.
Contribution
It presents a novel, distribution-agnostic compression framework that exploits tensor sparsity and includes a theoretical model for optimizing compression, with GPU acceleration for real-time performance.
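To make the quantization and sparsity steps concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes 8-bit asymmetric quantization with a per-tensor scale and zero point, and an index/value sparse layout; the function names (`quantize_asymmetric`, `to_sparse`, and so on) are hypothetical and chosen only for illustration.

```python
def quantize_asymmetric(values, num_bits=8):
    """Map floats to integers in [0, 2**num_bits - 1] via a scale and zero point."""
    qmax = (1 << num_bits) - 1
    # Include 0.0 in the range so that exact zeros map to the zero point.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax or 1.0  # fall back to 1.0 for an all-zero tensor
    zero_point = round(-lo / scale)
    q = [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Approximate inverse of quantize_asymmetric."""
    return [(x - zero_point) * scale for x in q]


def to_sparse(q, zero_point):
    """Keep only the entries that differ from the quantized zero."""
    indices = [i for i, x in enumerate(q) if x != zero_point]
    vals = [q[i] for i in indices]
    return indices, vals, len(q)


def from_sparse(indices, vals, length, zero_point):
    """Rebuild the dense quantized tensor from its sparse form."""
    q = [zero_point] * length
    for i, x in zip(indices, vals):
        q[i] = x
    return q


# Example: a flattened post-ReLU feature map, which is mostly zeros.
features = [0.0, 0.0, 1.5, 0.0, 0.25, 0.0, 0.0, 3.0]
q, scale, zp = quantize_asymmetric(features)
indices, vals, length = to_sparse(q, zp)
restored = dequantize(from_sparse(indices, vals, length, zp), scale, zp)
```

In this sketch only the non-zero values and their positions would need to be entropy coded and transmitted, which is where the bandwidth saving from sparsity comes from.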
Findings
Consistently maintains near-baseline accuracy on CIFAR100 and ImageNet.
Effective across diverse neural architectures including vision and NLP models.
Achieves sub-millisecond encoding/decoding latency with minimal computational overhead.
Abstract
Split computing distributes deep neural network inference between resource-constrained edge devices and cloud servers, but transmitting intermediate features creates a significant communication bottleneck. To address this, we propose a novel lightweight compression framework that leverages Range Asymmetric Numeral Systems (rANS) encoding with asymmetric integer quantization and sparse tensor representation to reduce transmission overhead dramatically. Because the approach relies only on quantization and a sparse representation, it eliminates the need for complex probability modeling or network modifications. The key contributions include: (1) a distribution-agnostic compression pipeline that exploits inherent tensor sparsity to achieve bandwidth reduction with minimal computational overhead; (2) an approximate theoretical model that optimizes…
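As a rough illustration of the rANS state update that the abstract refers to, here is a minimal sketch in plain Python. It is an assumption-laden toy, not the paper's GPU-accelerated encoder: it uses raw symbol counts as the frequency table, keeps the whole state in one arbitrary-precision integer, and omits the renormalization step that streaming implementations use to keep the state within a fixed word size. The names `build_table`, `rans_encode`, and `rans_decode` are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def build_table(symbols):
    """Build a frequency table (alphabet, freqs, cumulative starts, total)."""
    counts = Counter(symbols)
    alphabet = sorted(counts)
    freqs = [counts[s] for s in alphabet]
    starts = [0]
    for f in freqs[:-1]:
        starts.append(starts[-1] + f)
    return alphabet, freqs, starts, sum(freqs)


def rans_encode(symbols, table):
    """Fold all symbols into a single integer state (no renormalization)."""
    alphabet, freqs, starts, total = table
    index = {s: i for i, s in enumerate(alphabet)}
    x = 1
    # rANS is last-in, first-out, so encode in reverse to decode in order.
    for s in reversed(symbols):
        i = index[s]
        x = (x // freqs[i]) * total + starts[i] + (x % freqs[i])
    return x


def rans_decode(x, n, table):
    """Recover n symbols from the integer state produced by rans_encode."""
    alphabet, freqs, starts, total = table
    out = []
    for _ in range(n):
        slot = x % total
        i = bisect_right(starts, slot) - 1  # symbol whose range contains slot
        out.append(alphabet[i])
        x = freqs[i] * (x // total) + slot - starts[i]
    return out


# Example: quantized values dominated by one symbol compress well.
symbols = [0, 0, 0, 5, 0, 0, 9, 0, 0, 0, 0, 5, 0, 0, 0, 0]
table = build_table(symbols)
code = rans_encode(symbols, table)
decoded = rans_decode(code, len(symbols), table)
```

Frequent symbols grow the state by a smaller factor than rare ones, so a skewed distribution yields a state with fewer bits than a fixed 8 bits per symbol would need. The frequency table itself must also reach the decoder, which this sketch does not account for.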
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis
