Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantization
Christopher Subia-Waud, Srinandan Dasmahapatra

TL;DR
This paper introduces a Bayesian approach to weight-sharing quantization in neural networks. It uses learned, position-specific weight uncertainties to decide which weights can move to which cluster centre, and reports better accuracy and compressibility than existing methods on large models.
Contribution
A framework based on Bayesian neural networks, together with a new initialization setting and a regularization term, that quantizes weights according to their position-specific uncertainty rather than their value alone.
Findings
Reports 1.6% higher top-1 accuracy than the state-of-the-art quantization method on ImageNet with DeiT-Tiny.
Represents DeiT-Tiny's more than 5 million weights with only 296 unique values.
Demonstrates superior compressibility and accuracy over state-of-the-art methods.
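To put the 296 unique values in perspective, here is a small arithmetic sketch. It assumes each weight is stored as a fixed-width index into a shared codebook. That storage scheme is an illustrative assumption, not something this summary attributes to the paper, and the helper name `index_bits` is hypothetical.

```python
def index_bits(n_unique: int) -> int:
    """Bits needed for a fixed-width index into a codebook of n_unique values.

    Assumption for illustration only: weights are stored as plain
    fixed-width codebook indices, with no entropy coding.
    """
    if n_unique < 1:
        raise ValueError("codebook must contain at least one value")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float error.
    return (n_unique - 1).bit_length()


bits = index_bits(296)  # 2**8 = 256 < 296 <= 512 = 2**9
print(bits)             # 9
```

Under that assumption, each weight needs a 9-bit index instead of a 32-bit float. An entropy coder could use fewer bits on average if some values occur much more often than others.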
Abstract
Weight-sharing quantization has emerged as a technique to reduce energy expenditure during inference in large neural networks by constraining their weights to a limited set of values. However, existing methods for weight-sharing quantization often make assumptions about the treatment of weights based on value alone that neglect the unique role weight position plays. This paper proposes a probabilistic framework based on Bayesian neural networks (BNNs) and a variational relaxation to identify which weights can be moved to which cluster centre and to what degree based on their individual position-specific learned uncertainty distributions. We introduce a new initialisation setting and a regularisation term which allow for the training of BNNs under complex dataset-model combinations. By leveraging the flexibility of weight values captured through a probability distribution, we enhance…
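The idea of moving a weight to a cluster centre only when its learned uncertainty permits can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the rule "snap a weight to its nearest centre if the centre lies within `delta` standard deviations of the weight's mean" is an assumption, as are the function name `assign_to_centres`, the parameter `delta`, and the toy numbers.

```python
import numpy as np


def assign_to_centres(mu, sigma, centres, delta=1.0):
    """Snap each weight to its nearest centre if that centre is close enough.

    mu, sigma: per-weight mean and standard deviation (1-D, same length).
    centres:   candidate cluster centres (1-D).
    delta:     hypothetical tolerance, in standard deviations.
    Returns the (partly) quantized weights and a boolean mask of moved weights.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    centres = np.asarray(centres, dtype=float)
    # Distance from every weight mean to every centre: (n_weights, n_centres).
    dist = np.abs(mu[:, None] - centres[None, :])
    nearest = dist.argmin(axis=1)
    nearest_dist = dist[np.arange(mu.size), nearest]
    # A weight moves only if its own uncertainty covers the required shift.
    movable = nearest_dist <= delta * sigma
    quantized = np.where(movable, centres[nearest], mu)
    return quantized, movable


# Two weights share the value 0.11 but differ in uncertainty.
mu = [0.11, 0.11, -0.24]
sigma = [0.05, 0.005, 0.10]
centres = [-0.25, 0.0, 0.125, 0.5]
quantized, movable = assign_to_centres(mu, sigma, centres)
print(movable)    # [ True False  True]
print(quantized)  # [ 0.125  0.11  -0.25 ]
```

The first two weights have the same value, yet only the one with the larger uncertainty is moved. A rule based on value alone would treat them identically, which is the limitation the abstract describes.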
Taxonomy
Topics: Advanced Neural Network Applications · Image Enhancement Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Average Pooling · Batch Normalization · Kaiming Initialization · Residual Connection · Residual Block · Global Average Pooling · 1x1 Convolution · Bottleneck Residual Block · Max Pooling
