INSTA-BNN: Binary Neural Network with INSTAnce-aware Threshold
Changhun Lee, Hyungjun Kim, Eunhyeok Park, Jae-Joon Kim

TL;DR
INSTA-BNN introduces an instance-aware threshold mechanism for binary neural networks, dynamically adjusting quantization thresholds based on input statistics to improve accuracy with minimal overhead.
Contribution
The paper proposes a novel dynamic threshold adjustment method for BNNs using higher-order statistics, enhancing accuracy without significant computational cost.
Findings
Outperforms the baseline by 3.0% and 2.8% on ImageNet classification
Achieves 68.5% and 72.2% top-1 accuracy with ResNet-18 and MobileNetV1 based models, respectively
Maintains a computational cost comparable to the baseline
Abstract
Binary Neural Networks (BNNs) have emerged as a promising solution for reducing the memory footprint and compute costs of deep neural networks, but they suffer from quality degradation due to the lack of freedom, as activations and weights are constrained to binary values. To compensate for the accuracy drop, we propose a novel BNN design called Binary Neural Network with INSTAnce-aware threshold (INSTA-BNN), which controls the quantization threshold dynamically in an input-dependent or instance-aware manner. According to our observation, higher-order statistics can be a representative metric to estimate the characteristics of the input distribution. INSTA-BNN is designed to adjust the threshold dynamically considering various information, including higher-order statistics, but it is also optimized judiciously to realize minimal overhead on a real device. Our extensive study shows…
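Illustrative sketch
The idea of an input-dependent threshold can be illustrated with a small NumPy example. This is a minimal sketch under stated assumptions, not the authors' formulation: the function names, the parameters `base_threshold` and `scale`, and the choice of a tanh-bounded skewness term as the higher-order statistic are all hypothetical and chosen only to show how a per-instance threshold differs from a fixed one.

```python
import numpy as np


def static_binarize(x, threshold):
    """Conventional binarization: one fixed threshold per channel.

    x: activations of shape (N, C, H, W); threshold: shape (C,).
    """
    return np.where(x >= threshold[None, :, None, None], 1.0, -1.0)


def instance_aware_binarize(x, base_threshold, scale, eps=1e-5):
    """Binarize with a threshold computed per instance and per channel.

    base_threshold and scale (both shape (C,)) are hypothetical learnable
    parameters; the skewness-based shift is an assumption for illustration.
    """
    # First- and second-order statistics over the spatial dimensions.
    mean = x.mean(axis=(2, 3), keepdims=True)
    std = x.std(axis=(2, 3), keepdims=True)

    # Third standardized moment (skewness) as a higher-order statistic.
    x_hat = (x - mean) / (std + eps)
    skew = (x_hat ** 3).mean(axis=(2, 3), keepdims=True)

    # Bounded, input-dependent shift added to the static threshold.
    shift = np.tanh(scale[None, :, None, None] * skew)
    threshold = base_threshold[None, :, None, None] + shift

    return np.where(x >= threshold, 1.0, -1.0), threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(2, 3, 8, 8))
    x[1] = rng.exponential(size=(3, 8, 8))  # a more skewed instance

    out, th = instance_aware_binarize(x, np.zeros(3), np.ones(3))
    print(out.shape, th.shape)  # one threshold per (instance, channel)
```

In this sketch, setting `scale` to zero recovers the static threshold, so the instance-aware term acts as a correction on top of a conventional BNN activation. The extra work is a few reductions per channel, which is in the spirit of the low-overhead goal stated in the abstract, although the paper's actual design and optimizations may differ.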
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · COVID-19 diagnosis using AI
Methods: Depthwise Convolution · Pointwise Convolution · Average Pooling · Depthwise Separable Convolution · Batch Normalization · Dense Connections · Convolution · Softmax · Global Average Pooling
