Compressing Deep Convolutional Neural Networks by Stacking Low-dimensional Binary Convolution Filters
Weichao Lan, Liang Lan

TL;DR
This paper introduces a CNN compression method that stacks low-dimensional binary filters. It raises the compression ratio beyond the roughly 32x bound of existing binary CNNs while keeping accuracy comparable, which makes the resulting models suitable for memory-limited devices.
Contribution
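The paper proposes a model compression technique that approximates each standard convolution filter by stacking filters selected from a set of low-dimensional binary filters. This exceeds the roughly 32x compression bound of existing binary CNNs while still allowing efficient training and deployment.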
Findings
Achieves higher compression ratios than existing binary CNNs.
Maintains comparable accuracy with significantly reduced memory usage.
Provides theoretical and empirical validation of the method.
Abstract
Deep Convolutional Neural Networks (CNNs) have been successfully applied to many real-life problems. However, the huge memory cost of deep CNN models makes it challenging to deploy them on memory-constrained devices (e.g., mobile phones). One popular way to reduce the memory cost of a deep CNN model is to train a binary CNN, in which the weights in the convolution filters are either 1 or -1, so each weight can be stored efficiently using a single bit. However, the compression ratio of existing binary CNN models is upper bounded by around 32. To address this limitation, we propose a novel method to compress deep CNN models by stacking low-dimensional binary convolution filters. Our proposed method approximates a standard convolution filter by selecting and stacking filters from a set of low-dimensional binary convolution filters. This set of low-dimensional binary convolution filters…
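The bound of around 32 follows from replacing each 32-bit float weight with a single bit. The sketch below illustrates that arithmetic with NumPy, then shows one possible reading of "selecting and stacking" low-dimensional binary filters. The helper names (`binarize`, `pack_bits`, `stack_filter`), the shared `bank`, the choice to stack along the channel axis, and the hand-picked `indices` are all assumptions made for illustration; the excerpt does not specify how the paper selects or stacks filters.

```python
import numpy as np

def binarize(w):
    """Map real-valued weights to {-1, +1} by sign (zeros map to +1)."""
    return np.where(w >= 0, 1, -1).astype(np.int8)

def pack_bits(b):
    """Store a {-1, +1} array at one bit per weight."""
    return np.packbits((b.ravel() > 0).astype(np.uint8))

def stack_filter(bank, indices):
    """Hypothetical stacking: concatenate selected low-dimensional
    binary filters along the channel axis to form one full-depth filter."""
    return np.concatenate([bank[i] for i in indices], axis=0)

rng = np.random.default_rng(0)
c_in, k, h, w = 64, 4, 3, 3          # k pieces per filter (assumed)

# A standard float32 filter and its 1-bit-per-weight binary version.
full = rng.standard_normal((c_in, h, w)).astype(np.float32)
packed = pack_bits(binarize(full))
ratio_binary = full.nbytes / packed.nbytes   # 32 bits -> 1 bit per weight

# A small shared set of binary filters with depth c_in // k (assumed layout).
bank = binarize(rng.standard_normal((8, c_in // k, h, w)))
indices = [3, 0, 5, 3]               # illustrative selection, not learned
stacked = stack_filter(bank, indices)

print(ratio_binary)                  # compression from plain binarization
print(stacked.shape)                 # full-depth filter built from the bank
```

In this toy setup a full-depth filter is described by `k` indices into the shared bank rather than by its own bits, which suggests how a stacking scheme could in principle go past the 32x bound. The actual selection procedure and storage accounting of the paper are not reproduced here.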
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
Methods: Convolution
