Lightweight compression of neural network feature tensors for collaborative intelligence

Robert A. Cohen; Hyomin Choi; Ivan V. Bajić

arXiv:2105.06002 · cs.LG · May 14, 2021

TL;DR

This paper introduces a lightweight, low-complexity compression method for neural network activations in collaborative intelligence, enabling efficient edge-cloud DNN split deployment with minimal accuracy loss.

Contribution

It proposes a novel compression technique for DNN activations that requires no retraining and is optimized for edge devices, outperforming traditional codecs like HEVC.

Findings

01. Achieved 0.6 to 0.8 bits per activation with less than 1% accuracy loss

02. Outperformed HEVC in inference accuracy by up to 1.3%

03. Applicable to popular object detection and classification DNNs

Abstract

In collaborative intelligence applications, part of a deep neural network (DNN) is deployed on a relatively low-complexity device such as a mobile phone or edge device, and the remainder of the DNN is processed where more computing resources are available, such as in the cloud. This paper presents a novel lightweight compression technique designed specifically to code the activations of a split DNN layer, while having a low complexity suitable for edge devices and not requiring any retraining. We also present a modified entropy-constrained quantizer design algorithm optimized for clipped activations. When applied to popular object-detection and classification DNNs, we were able to compress the 32-bit floating point activations down to 0.6 to 0.8 bits, while keeping the loss in accuracy to less than 1%. When compared to HEVC, we found that the lightweight codec consistently provided…
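To make the "bits per activation" figure concrete, the following is a minimal sketch of the general idea of clipping activations, quantizing them coarsely, and measuring the empirical entropy of the resulting indices. It is not the authors' codec: it substitutes a plain uniform quantizer for the paper's modified entropy-constrained design, and the function names, clipping range, number of levels, and synthetic activations are all assumptions chosen for illustration. The entropy value is only an estimate of what an ideal entropy coder could reach on these indices.

```python
import math
import random
from collections import Counter


def clip_and_quantize(values, c_min, c_max, levels):
    """Clip each value to [c_min, c_max], then map it to the nearest of
    `levels` uniformly spaced reconstruction points (hypothetical helper)."""
    step = (c_max - c_min) / (levels - 1)
    indices = []
    for v in values:
        clipped = min(max(v, c_min), c_max)
        indices.append(round((clipped - c_min) / step))
    return indices, step


def dequantize(indices, c_min, step):
    """Map quantizer indices back to reconstruction values."""
    return [c_min + i * step for i in indices]


def empirical_entropy_bits(indices):
    """Empirical first-order entropy of the index stream, in bits per symbol."""
    counts = Counter(indices)
    n = len(indices)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


random.seed(0)
# Synthetic stand-in for post-ReLU activations (assumption): mostly zeros,
# with a tail of positive values.
activations = [max(0.0, random.gauss(-1.0, 1.0)) for _ in range(10000)]

C_MIN, C_MAX, LEVELS = 0.0, 2.0, 4  # illustrative settings, not from the paper
indices, step = clip_and_quantize(activations, C_MIN, C_MAX, LEVELS)
reconstructed = dequantize(indices, C_MIN, step)
bits_per_activation = empirical_entropy_bits(indices)

print(f"estimated bits per activation: {bits_per_activation:.3f}")
print(f"versus 32-bit floats: about {32 / bits_per_activation:.1f}x smaller")
```

Because most activations in this synthetic example fall into the zero bin, the index distribution is highly skewed, which is what allows the average rate to drop below one bit per activation even though four levels would need two bits with a fixed-length code.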
