Exploiting Kernel Compression on BNNs
Franyell Silfa, Jose Maria Arnau, Antonio González

TL;DR
This paper introduces a compression technique for Binary Neural Networks (BNNs) that leverages Huffman encoding and clustering to reduce storage and memory access, leading to improved performance on mobile CPUs.
Contribution
It proposes a novel compression scheme for BNNs using Huffman encoding and clustering, along with hardware support for efficient decoding on mobile CPUs.
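As a rough illustration of why Huffman encoding can shrink binary kernels, the sketch below builds a Huffman code over packed kernel bit patterns and counts the encoded size. It is a minimal sketch under stated assumptions, not the paper's scheme: the 9-bit patterns (as if from hypothetical 3x3 kernels), the toy frequency distribution, and the helper names `huffman_code_lengths` and `compressed_bits` are all invented for the example, and the clustering step and hardware decoder are not modeled.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over `symbols`."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth in tree}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def compressed_bits(symbols):
    """Total payload size in bits when each symbol uses its Huffman code."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)


# Assumed toy data: twelve 9-bit kernel patterns with a skewed distribution.
kernels = (
    [0b111111111] * 6
    + [0b000000000] * 3
    + [0b101010101] * 2
    + [0b010101010] * 1
)
raw_bits = 9 * len(kernels)
encoded_bits = compressed_bits(kernels)
```

The gain depends entirely on how skewed the pattern distribution is, and the count above ignores the cost of storing the code table, so the toy ratio says nothing about the figures reported for the paper.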
Findings
Memory requirement reduced by 1.32x
Performance improved by 1.35x
Effective on ImageNet with the ReActNet model
Abstract
Binary Neural Networks (BNNs) are showing tremendous success on realistic image classification tasks. Notably, their accuracy is similar to the state-of-the-art accuracy obtained by full-precision models tailored to edge devices. In this regard, BNNs are well suited to edge devices since they use 1 bit to store each input and weight, and thus their storage requirements are low. Also, BNN computations are mainly done using XNOR and popcount operations, which are implemented very efficiently using simple hardware structures. Nonetheless, supporting BNNs efficiently on mobile CPUs is far from trivial since their benefits are hindered by frequent memory accesses to load weights and inputs. In BNNs, a weight or an input is stored using one bit, and to increase storage and computation efficiency, several of them are packed together as a sequence of bits. In this work, we…
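The XNOR and popcount computation mentioned in the abstract can be sketched as follows. This is a minimal illustration, assuming the common convention that a set bit encodes +1 and a cleared bit encodes -1; the helper names `pack_bits` and `binary_dot` are hypothetical and do not come from the paper.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into one integer (bit i set means +1)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_word, w_word, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    # XNOR sets a bit wherever input and weight agree (product is +1).
    xnor = ~(a_word ^ w_word) & mask
    matches = bin(xnor).count("1")  # popcount
    # Each match contributes +1 and each mismatch -1.
    return 2 * matches - n


inputs = [1, -1, 1, 1, -1, -1, 1, -1]
weights = [1, 1, -1, 1, -1, 1, 1, -1]
result = binary_dot(pack_bits(inputs), pack_bits(weights), len(inputs))
reference = sum(a * w for a, w in zip(inputs, weights))
```

Packing several values into one word lets a single XNOR and a single popcount replace many multiply-accumulate steps, which is why loading the packed weights and inputs from memory can become the dominant cost.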
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Advanced Image and Video Retrieval Techniques
