OvSW: Overcoming Silent Weights for Accurate Binary Neural Networks

Jingyang Xiang; Zuohui Chen; Siqi Li; Qing Wu; Yong Liu

arXiv:2407.05257 · cs.CV · July 9, 2024

TL;DR

This paper identifies silent weights, weights whose signs stay unchanged during training, as a cause of slow convergence and reduced accuracy in binary neural networks. It proposes OvSW, which uses adaptive gradient scaling and silence awareness decay to make weight sign updates more efficient and improve model performance.

Contribution

The paper introduces OvSW, a novel method that enhances weight sign updates in BNNs by addressing silent weights, leading to faster convergence and state-of-the-art accuracy.

Findings

01. OvSW achieves 61.6% top-1 accuracy on ImageNet with ResNet18.

02. OvSW outperforms existing BNN methods on CIFAR10 and ImageNet.

03. In vanilla BNNs, silent weights (weights whose signs stay unchanged during training) make up over 50% of all weights and slow down convergence.

Abstract

Binary Neural Networks (BNNs) have been proven to be highly effective for deploying deep neural networks on mobile and embedded platforms. Most existing works focus on minimizing quantization errors, improving representation ability, or designing gradient approximations to alleviate gradient mismatch in BNNs, while leaving weight sign flipping, a critical factor for achieving powerful BNNs, untouched. In this paper, we investigate the efficiency of weight sign updates in BNNs. We observe that, for vanilla BNNs, over 50% of the weights keep their signs unchanged during training, and these weights are not only distributed at the tails of the weight distribution but also universally present in the vicinity of zero. We refer to these weights as "silent weights", which slow down convergence and lead to significant accuracy degradation. Theoretically, we reveal this is due to the…
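
To make the notion of a silent weight concrete, here is a minimal sketch of how the fraction of such weights could be measured from recorded snapshots of the latent (real-valued) weights. It is not the OvSW method and is not taken from the paper's code: the function name `silent_fraction`, the snapshot format, and the convention that a zero weight binarizes to +1 are all assumptions made for illustration. The paper's exact measurement protocol is not given in the text above.

```python
def silent_fraction(snapshots):
    """Fraction of weights whose sign is the same in every snapshot.

    snapshots: list of equal-length lists of latent (real-valued) weights,
    one list per recorded training step. Hypothetical format, for
    illustration only.
    """
    if not snapshots or not snapshots[0]:
        raise ValueError("need at least one non-empty snapshot")

    def sign(w):
        # Assumed convention: zero binarizes to +1.
        return 1 if w >= 0 else -1

    initial = [sign(w) for w in snapshots[0]]
    silent = [True] * len(initial)

    # A weight stops being silent as soon as its sign differs from the
    # sign it had in the first snapshot.
    for snap in snapshots[1:]:
        for i, w in enumerate(snap):
            if sign(w) != initial[i]:
                silent[i] = False

    return sum(silent) / len(silent)


# Toy example: four weights recorded at three steps.
snapshots = [
    [0.30, -0.02, 0.010, -0.50],
    [0.28, 0.01, 0.020, -0.49],
    [0.31, -0.03, 0.015, -0.51],
]
print(silent_fraction(snapshots))  # 0.75
```

In the toy data, only the second weight crosses zero. The third weight stays close to zero without ever flipping, which mirrors the observation that silent weights are not confined to the tails of the distribution.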

Code & Models

Repositories

JingyangXiang/OvSW (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications

Methods: Focus