Neural Networks with Quantization Constraints

Ignacio Hounie; Juan Elenter; Alejandro Ribeiro

arXiv:2210.15623 · cs.LG · October 28, 2022 · 1 citation

TL;DR

This paper introduces a constrained optimization approach to quantization-aware training of neural networks. It targets low-precision models with little performance loss and uses the dual variables of the constrained problem to analyze layer sensitivity.

Contribution

The paper formulates quantization-aware training as a constrained optimization problem that avoids gradient approximations, and it uses the dual variables as a measure of layer sensitivity to guide mixed precision quantization.

Findings

01. Competitive accuracy in image classification tasks

02. Layer sensitivity analysis guides effective quantization

03. Significant performance gains with mixed precision quantization

Abstract

Enabling low precision implementations of deep learning models, without considerable performance degradation, is necessary in resource- and latency-constrained settings. Moreover, exploiting the differences in sensitivity to quantization across layers can allow mixed precision implementations to achieve a considerably better computation-performance trade-off. However, backpropagating through the quantization operation requires introducing gradient approximations, and choosing which layers to quantize is challenging for modern architectures due to the large search space. In this work, we present a constrained learning approach to quantization-aware training. We formulate low precision supervised learning as a constrained optimization problem, and show that despite its non-convexity, the resulting problem is strongly dual and does away with gradient estimations. Furthermore, we show that…
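
The idea of using dual variables as a sensitivity signal can be sketched on a toy problem. The example below is not taken from the paper or from the ihounie/pd-qat repository; the two scalar "layers", the quadratic loss, the rounding quantizer, the tolerance eps, and the step sizes are all assumptions made for illustration. Each weight must stay within eps (in squared distance) of its quantized value, and a primal-dual loop alternates gradient descent on the weights with projected gradient ascent on the dual variables. Because the quantizer is piecewise constant, the gradient of the proximity term with respect to the weight is exact almost everywhere, so this particular sketch needs no straight-through estimator.

```python
import numpy as np


def quantize(w, step=1.0):
    """Uniform quantizer: snap each weight to the nearest multiple of `step`."""
    return step * np.round(w / step)


def primal_dual(targets, importance, eps=0.01, lr_w=0.02, lr_dual=10.0, iters=5000):
    """Toy primal-dual loop (illustrative assumption, not the paper's algorithm).

    Minimizes sum_i importance_i * (w_i - targets_i)^2
    subject to (w_i - quantize(w_i))^2 <= eps for every i.
    """
    w = np.asarray(targets, dtype=float).copy()  # start at the unconstrained optimum
    lam = np.zeros_like(w)                       # one dual variable per "layer"
    for _ in range(iters):
        q = quantize(w)
        slack = (w - q) ** 2 - eps               # positive when the constraint is violated
        # Gradient of the Lagrangian in w; q is locally constant, so this is exact a.e.
        grad = 2.0 * importance * (w - targets) + 2.0 * lam * (w - q)
        w = w - lr_w * grad                      # primal descent step
        lam = np.maximum(0.0, lam + lr_dual * slack)  # projected dual ascent step
    return w, lam


targets = np.array([0.4, 0.4])      # both weights want to sit off the integer grid
importance = np.array([1.0, 4.0])   # the second "layer" matters more to the loss
w, lam = primal_dual(targets, importance)
print("weights:", w)
print("dual variables:", lam)
```

In this toy setting the KKT conditions give a dual variable of 3 times the importance weight for each constraint, so the weight whose loss term is more heavily weighted ends up with the larger multiplier. That is the sense in which a dual variable can rank constraints by how much the objective would improve if they were relaxed.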

Code & Models

Repositories

ihounie/pd-qat (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Attentive Walk-Aggregating Graph Neural Network