OvSW: Overcoming Silent Weights for Accurate Binary Neural Networks
Jingyang Xiang, Zuohui Chen, Siqi Li, Qing Wu, Yong Liu

TL;DR
This paper identifies the problem of silent weights in binary neural networks that hinder training efficiency and accuracy, and proposes OvSW with adaptive gradient scaling and silence awareness decay to improve weight sign updates and model performance.
Contribution
The paper introduces OvSW, a novel method that enhances weight sign updates in BNNs by addressing silent weights, leading to faster convergence and state-of-the-art accuracy.
Findings
OvSW achieves 61.6% top-1 accuracy on ImageNet with ResNet18.
OvSW outperforms existing BNN methods on CIFAR10 and ImageNet.
In vanilla BNNs, over 50% of weights never change sign during training; these silent weights slow convergence and degrade accuracy.
Abstract
Binary Neural Networks (BNNs) have been proven to be highly effective for deploying deep neural networks on mobile and embedded platforms. Most existing works focus on minimizing quantization errors, improving representation ability, or designing gradient approximations to alleviate gradient mismatch in BNNs, while leaving weight sign flipping, a critical factor for achieving powerful BNNs, untouched. In this paper, we investigate the efficiency of weight sign updates in BNNs. We observe that, for vanilla BNNs, over 50% of the weights keep their signs unchanged during training, and these weights are not only distributed at the tails of the weight distribution but also universally present in the vicinity of zero. We refer to these weights as "silent weights", which slow down convergence and lead to a significant accuracy degradation. Theoretically, we reveal this is due to the…
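The abstract's definition of a silent weight (a weight whose sign never changes during training) can be made concrete with a small measurement sketch. This is not the authors' code. The function names `binarize` and `silent_fraction`, the convention that zero maps to +1, and the toy weight snapshots are all assumptions made for illustration only.

```python
from typing import Sequence


def binarize(w: float) -> int:
    """Sign binarization. Mapping 0 to +1 is an assumed convention."""
    return 1 if w >= 0 else -1


def silent_fraction(snapshots: Sequence[Sequence[float]]) -> float:
    """Fraction of weights whose binarized sign is identical in every snapshot.

    `snapshots` holds the same latent (real-valued) weights recorded at
    successive training steps; snapshots[0] is the reference point.
    """
    if not snapshots or not snapshots[0]:
        raise ValueError("need at least one non-empty snapshot")
    initial = [binarize(w) for w in snapshots[0]]
    silent = [True] * len(initial)
    for snap in snapshots[1:]:
        if len(snap) != len(initial):
            raise ValueError("all snapshots must have the same length")
        for i, w in enumerate(snap):
            # A single sign flip at any step means the weight is not silent.
            if binarize(w) != initial[i]:
                silent[i] = False
    return sum(silent) / len(silent)


# Toy latent weights recorded at three steps (hypothetical values).
snapshots = [
    [0.30, -0.02, 0.01, -0.50],
    [0.28, 0.01, 0.02, -0.48],
    [0.31, -0.01, 0.03, -0.51],
]
print(silent_fraction(snapshots))  # 0.75: only the second weight flips sign
```

In this toy data the third weight stays near zero yet never flips, which mirrors the abstract's observation that silent weights are not confined to the tails of the weight distribution.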
Taxonomy
Topics: Neural Networks and Applications
Methods: Focus
