Learnable Mixed-precision and Dimension Reduction Co-design for Low-storage Activation

Yu-Shan Tai, Cheng-Yang Chang, Chieh-Fang Teng, and AnYeu (Andy) Wu

arXiv:2207.07931 · eess.IV · July 20, 2022

Open Access

TL;DR

This paper introduces a learnable co-design system combining mixed-precision and dimension reduction techniques to compress activations in CNNs, significantly reducing memory usage and improving accuracy on edge devices.

Contribution

It proposes a novel dynamic search method for optimal bit-width allocation in activation compression, enhancing existing mixed-precision approaches.

Findings

01

Improves accuracy by 3.54% on ResNet18 and 1.27% on MobileNetv2 over existing mixed-precision methods.

02

Saves 0.18 and 2.02 bits per value on ResNet18 and MobileNetv2, respectively, compared with existing mixed-precision methods.

03

Demonstrates effective activation compression for resource-constrained edge deployment.

Abstract

Recently, deep convolutional neural networks (CNNs) have achieved many eye-catching results. However, deploying CNNs on resource-constrained edge devices is hindered by the limited memory bandwidth available for transmitting large intermediate data, i.e., activations, during inference. Existing research uses mixed-precision and dimension reduction to reduce computational complexity but pays less attention to their application to activation compression. To further exploit the redundancy in activations, we propose a learnable mixed-precision and dimension reduction co-design system, which separates channels into groups and allocates specific compression policies according to their importance. In addition, the proposed dynamic searching technique enlarges the search space and finds the optimal bit-width allocation automatically. Our experimental results show that the proposed methods improve…
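To make the idea of grouping channels and assigning each group its own compression policy more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fixed importance score (mean absolute activation per channel), a hand-picked list of bit-widths per group, and plain min-max uniform quantization. A bit-width of 0 stands in for dimension reduction by dropping the channel. The paper's learnable policy and dynamic bit-width search are not reproduced here, and all function and variable names are hypothetical.

```python
from typing import List, Sequence, Tuple


def quantize_channel(values: Sequence[float], bits: int) -> List[float]:
    """Uniform min-max quantization to 2**bits levels; bits == 0 drops the channel."""
    if bits == 0:
        return [0.0] * len(values)
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # constant channel: nothing to quantize
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + round((v - lo) / step) * step for v in values]


def compress(
    activation: Sequence[Sequence[float]], group_bits: Sequence[int]
) -> Tuple[List[List[float]], List[int], float]:
    """Rank channels by importance, split them into groups, quantize per group.

    group_bits[0] is applied to the most important group, and so on.
    Returns the dequantized activation, the bit-width chosen for each
    channel, and the average bits per value (ignoring min/max metadata).
    """
    n = len(activation)
    # Assumed importance score: mean absolute value of each channel.
    importance = [sum(abs(v) for v in ch) / len(ch) for ch in activation]
    order = sorted(range(n), key=lambda c: importance[c], reverse=True)
    group_size = -(-n // len(group_bits))  # ceiling division
    bits_per_channel = [0] * n
    for rank, c in enumerate(order):
        bits_per_channel[c] = group_bits[rank // group_size]
    quantized = [quantize_channel(ch, b) for ch, b in zip(activation, bits_per_channel)]
    total_bits = sum(b * len(ch) for ch, b in zip(activation, bits_per_channel))
    total_values = sum(len(ch) for ch in activation)
    return quantized, bits_per_channel, total_bits / total_values


# Toy activation: 4 channels with 4 values each.
activation = [
    [0.9, -1.2, 0.4, 1.0],
    [0.05, 0.02, -0.01, 0.03],
    [0.5, -0.6, 0.3, 0.2],
    [0.0, 0.01, 0.0, -0.01],
]
quantized, bits, avg_bits = compress(activation, group_bits=[8, 4, 2, 0])
print(bits)      # [8, 2, 4, 0]
print(avg_bits)  # 3.5
```

In this toy case the least important channel is dropped entirely and the remaining channels receive 8, 4, and 2 bits, giving an average of 3.5 bits per value. A learned scheme would choose the grouping and the bit-widths during training rather than fixing them by hand.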

Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Domain Adaptation and Few-Shot Learning