Up or Down? Adaptive Rounding for Post-Training Quantization
Markus Nagel, Rana Ali Amjad, Mart van Baalen, Christos Louizos, Tijmen Blankevoort

TL;DR
AdaRound introduces an adaptive, data-driven weight rounding method for post-training neural network quantization, significantly improving accuracy without fine-tuning and requiring minimal unlabelled data.
Contribution
The paper presents AdaRound, an adaptive rounding technique that decides whether each weight is rounded up or down based on the effect on the task loss, rather than always picking the nearest fixed-point value.
Findings
AdaRound outperforms rounding-to-nearest by a significant margin.
Enables 4-bit quantization of ResNet models with minimal accuracy loss.
Establishes new state-of-the-art results for post-training quantization.
Abstract
When quantizing neural networks, assigning each floating-point weight to its nearest fixed-point value is the predominant approach. We find that, perhaps surprisingly, this is not the best we can do. In this paper, we propose AdaRound, a better weight-rounding mechanism for post-training quantization that adapts to the data and the task loss. AdaRound is fast, does not require fine-tuning of the network, and only uses a small amount of unlabelled data. We start by theoretically analyzing the rounding problem for a pre-trained neural network. By approximating the task loss with a Taylor series expansion, the rounding task is posed as a quadratic unconstrained binary optimization problem. We simplify this to a layer-wise local loss and propose to optimize this loss with a soft relaxation. AdaRound not only outperforms rounding-to-nearest by a significant margin but also establishes a new…
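The claim that rounding-to-nearest is not optimal can be illustrated on a toy problem. The sketch below is not AdaRound itself: instead of the soft relaxation described in the abstract, it brute-forces every up/down choice for a tiny weight vector and scores each one by a layer-wise output reconstruction error. The function names (`layer_error`, `best_rounding`), the grid step `scale`, the synthetic correlated inputs, and the omission of clipping to a fixed bit-width are all assumptions made for illustration.

```python
import itertools
import numpy as np

def layer_error(w_q, w, x):
    """Mean squared difference between quantized and original layer outputs."""
    return float(np.mean((x @ w_q - x @ w) ** 2))

def best_rounding(w, x, scale):
    """Exhaustively search all up/down rounding choices (feasible only for tiny w)."""
    lo = np.floor(w / scale) * scale   # round every weight down to the grid
    hi = lo + scale                    # round every weight up to the grid
    best_err, best_w = np.inf, None
    for mask in itertools.product([False, True], repeat=w.size):
        w_q = np.where(np.array(mask), hi, lo)
        err = layer_error(w_q, w, x)
        if err < best_err:
            best_err, best_w = err, w_q
    return best_w, best_err

rng = np.random.default_rng(0)
w = rng.normal(size=8)                       # hypothetical pre-trained weights
# Correlated inputs: a shared component makes per-weight rounding errors interact.
x = rng.normal(size=(256, 8)) + 2.0 * rng.normal(size=(256, 1))
scale = 0.5                                  # hypothetical quantization step

w_nearest = np.round(w / scale) * scale      # rounding-to-nearest baseline
nearest_err = layer_error(w_nearest, w, x)
best_w, best_err = best_rounding(w, x, scale)

print(f"rounding-to-nearest error: {nearest_err:.4f}")
print(f"best up/down error:        {best_err:.4f}")
```

Because rounding-to-nearest is one of the candidates in the search, the best up/down assignment can never score worse than it on this local loss. The exhaustive search grows as 2^n in the number of weights, which is why a relaxed, gradient-based optimization is needed for real layers.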
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
