Learnable Mixed-precision and Dimension Reduction Co-design for Low-storage Activation
Yu-Shan Tai, Cheng-Yang Chang, Chieh-Fang Teng, and AnYeu (Andy) Wu

TL;DR
This paper introduces a learnable co-design system combining mixed-precision and dimension reduction techniques to compress activations in CNNs, significantly reducing memory usage and improving accuracy on edge devices.
Contribution
It proposes a novel dynamic search method for optimal bit-width allocation in activation compression, enhancing existing mixed-precision approaches.
Findings
Improves accuracy by 3.54%/1.27% on ResNet18/MobileNetv2, respectively.
Reduces bits per value by 0.18/2.02 on the same two models compared to existing methods.
Demonstrates effective activation compression for resource-constrained edge deployment.
Abstract
Recently, deep convolutional neural networks (CNNs) have achieved many eye-catching results. However, deploying CNNs on resource-constrained edge devices is limited by the memory bandwidth needed to transmit large intermediate data, i.e., activations, during inference. Existing research utilizes mixed-precision and dimension reduction to reduce computational complexity but pays less attention to their application to activation compression. To further exploit the redundancy in activations, we propose a learnable mixed-precision and dimension reduction co-design system, which separates channels into groups and allocates specific compression policies according to their importance. In addition, the proposed dynamic searching technique enlarges the search space and finds the optimal bit-width allocation automatically. Our experimental results show that the proposed methods improve…
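The grouping idea in the abstract (separate channels into groups, then give each group its own compression policy according to importance) can be illustrated with a minimal NumPy sketch. This is not the authors' method: the importance proxy (mean absolute activation), the fixed bit-width list, the min-max uniform quantizer, and the use of 0 bits to mean "drop the group" as a crude stand-in for dimension reduction are all assumptions made for illustration. The paper learns the allocation, whereas here it is supplied by hand. The function names `quantize_uniform` and `compress_activation` are hypothetical.

```python
import numpy as np

def quantize_uniform(x, bits):
    """Min-max uniform quantization to `bits` bits; returns the dequantized array.

    Assumption for this sketch: bits == 0 means the channel is dropped (zeros).
    """
    if bits == 0:
        return np.zeros_like(x)
    lo, hi = x.min(), x.max()
    if hi == lo:
        return x.copy()  # constant channel, nothing to quantize
    scale = (hi - lo) / (2 ** bits - 1)
    return np.round((x - lo) / scale) * scale + lo

def compress_activation(act, group_bits):
    """Compress a (C, H, W) activation with one bit-width per channel group.

    group_bits lists bit-widths from the most to the least important group.
    Returns the reconstructed activation and the average bits per value.
    """
    # Hypothetical importance proxy: mean absolute value of each channel.
    importance = np.abs(act).mean(axis=(1, 2))
    order = np.argsort(-importance)                  # most important first
    groups = np.array_split(order, len(group_bits))  # near-equal group sizes

    out = np.empty_like(act)
    total_bits = 0
    values_per_channel = act[0].size
    for channels, bits in zip(groups, group_bits):
        for ch in channels:
            out[ch] = quantize_uniform(act[ch], bits)
        total_bits += bits * channels.size * values_per_channel
    return out, total_bits / act.size

# Usage: 8 channels in 4 groups with 8, 4, 2, and 0 bits.
rng = np.random.default_rng(0)
act = rng.standard_normal((8, 4, 4)).astype(np.float32)
recon, avg_bits = compress_activation(act, [8, 4, 2, 0])
print("average bits per value:", avg_bits)
print("max reconstruction error:", float(np.abs(recon - act).max()))
```

With equal-sized groups, the average bits per value is simply the mean of the group bit-widths, which is the quantity the "bits per value" figures in the findings refer to. A learned search would adjust both the group boundaries and the bit-widths rather than fixing them as this sketch does.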
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Domain Adaptation and Few-Shot Learning
