Up or Down? Adaptive Rounding for Post-Training Quantization

Markus Nagel; Rana Ali Amjad; Mart van Baalen; Christos Louizos; Tijmen Blankevoort

arXiv:2004.10568 · cs.LG · July 1, 2020 · 55 cites

Open Access

TL;DR

AdaRound introduces an adaptive, data-driven weight rounding method for post-training neural network quantization, significantly improving accuracy without fine-tuning and requiring minimal unlabelled data.

Contribution

The paper presents AdaRound, an adaptive rounding technique that chooses whether to round each weight up or down based on the data and the task loss, outperforming rounding-to-nearest.

Findings

01

AdaRound outperforms rounding-to-nearest by a significant margin.

02

Enables 4-bit quantization of ResNet models with minimal accuracy loss.

03

Establishes new state-of-the-art results for post-training quantization.

Abstract

When quantizing neural networks, assigning each floating-point weight to its nearest fixed-point value is the predominant approach. We find that, perhaps surprisingly, this is not the best we can do. In this paper, we propose AdaRound, a better weight-rounding mechanism for post-training quantization that adapts to the data and the task loss. AdaRound is fast, does not require fine-tuning of the network, and only uses a small amount of unlabelled data. We start by theoretically analyzing the rounding problem for a pre-trained neural network. By approximating the task loss with a Taylor series expansion, the rounding task is posed as a quadratic unconstrained binary optimization problem. We simplify this to a layer-wise local loss and propose to optimize this loss with a soft relaxation. AdaRound not only outperforms rounding-to-nearest by a significant margin but also establishes a new…
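The claim that rounding-to-nearest is not the best choice can be illustrated with a toy layer-wise loss. The sketch below is not the paper's algorithm: it brute-forces every up/down assignment for a two-weight layer instead of using AdaRound's soft relaxation, and the function names (`local_loss`, `round_nearest`, `best_up_down`), the input `x`, the weights `w`, and the grid step `scale` are all hypothetical values chosen for illustration.

```python
import itertools

import numpy as np


def local_loss(x, w_float, w_quant):
    """Squared error between layer outputs with float and quantized weights."""
    return float(np.sum((x @ w_float - x @ w_quant) ** 2))


def round_nearest(w, scale):
    """Baseline: snap every weight to its nearest grid point."""
    return scale * np.round(w / scale)


def best_up_down(x, w, scale):
    """Exhaustively try rounding each weight down (0) or up (1).

    Only feasible for a handful of weights; used here purely to show
    that the nearest grid point is not always the best one.
    """
    floor = np.floor(w / scale)
    best_w, best_loss = None, float("inf")
    for bits in itertools.product((0, 1), repeat=w.size):
        candidate = scale * (floor + np.array(bits))
        loss = local_loss(x, w, candidate)
        if loss < best_loss:
            best_w, best_loss = candidate, loss
    return best_w, best_loss


# Hypothetical toy layer: one input sample, two weights, grid step 1.0.
x = np.array([[1.0, 1.0]])
w = np.array([0.4, 0.4])
scale = 1.0

nearest = round_nearest(w, scale)
nearest_loss = local_loss(x, w, nearest)
best_w, best_loss = best_up_down(x, w, scale)

print("nearest:", nearest, "loss:", nearest_loss)
print("best up/down:", best_w, "loss:", best_loss)
```

In this toy case rounding-to-nearest sends both weights to 0 and gives a local loss of about 0.64, while rounding one weight up and the other down gives about 0.04, because the two rounding errors partly cancel in the layer output. Exhaustive search grows exponentially with the number of weights, which is why a practical method needs a relaxation of the kind the abstract describes.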

Code & Models

Videos

Up or Down? Adaptive Rounding for Post-Training Quantization · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning