FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search

Jordan Dotzel; Gang Wu; Andrew Li; Muhammad Umar; Yun Ni; Mohamed S. Abdelfattah; Zhiru Zhang; Liqun Cheng; Martin G. Dixon; Norman P. Jouppi; Quoc V. Le; Sheng Li

arXiv:2308.03290·cs.CV·May 2, 2024·1 cites

Open Access

TL;DR

This paper introduces FLIQS, a one-shot mixed-precision quantization search that chooses integer and low-precision floating-point formats for neural networks without a separate retraining step, and reports higher accuracy at similar model cost than prior methods.

Contribution

The paper proposes what its authors describe as the first one-shot mixed-precision quantization search that eliminates retraining. The search applies to both integer and low-precision floating-point models and extends to joint architecture and quantization search.

Findings

01

Improves ResNet-18 accuracy on ImageNet by 1.31% over previous integer quantization methods at equivalent model cost.

02

Improves MobileNetV2 by up to 0.98% over prior state-of-the-art FP8 models using mixed-precision floating-point search.

03

Achieves 2.69% higher ImageNet accuracy at similar model cost when quantization and architecture are searched jointly.

Abstract

Quantization has become a mainstream compression technique for reducing model size, computational requirements, and energy consumption for modern deep neural networks (DNNs). With improved numerical support in recent hardware, including multiple variants of integer and floating point, mixed-precision quantization has become necessary to achieve high-quality results with low model cost. Prior mixed-precision methods have performed either a post-training quantization search, which compromises on accuracy, or a differentiable quantization search, which leads to high memory usage from branching. Therefore, we propose the first one-shot mixed-precision quantization search that eliminates the need for retraining in both integer and low-precision floating point models. We evaluate our search (FLIQS) on multiple convolutional and vision transformer networks to discover Pareto-optimal models.…
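To make the accuracy-versus-cost trade-off behind mixed precision concrete, the following is a minimal sketch of symmetric integer fake quantization applied at different bit widths per layer. It is an illustration only, not the FLIQS search procedure: the function names, the layer names, and the weight values are all hypothetical, and model cost is approximated simply as total weight bits.

```python
from typing import Dict, List


def fake_quantize_int(values: List[float], bits: int) -> List[float]:
    """Round values to a signed integer grid of the given width, then dequantize.

    Illustrative symmetric per-tensor scheme; real quantizers differ in details.
    """
    if bits < 2:
        raise ValueError("bits must be >= 2 for a signed grid")
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)             # nothing to scale
    scale = max_abs / qmax
    out = []
    for v in values:
        q = round(v / scale)            # nearest grid point
        q = max(-qmax, min(qmax, q))    # clip to the representable range
        out.append(q * scale)
    return out


def mean_abs_error(a: List[float], b: List[float]) -> float:
    """Average absolute difference between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def model_size_bits(layer_sizes: Dict[str, int], layer_bits: Dict[str, int]) -> int:
    """Total weight storage in bits for a per-layer bit-width assignment."""
    return sum(layer_sizes[name] * layer_bits[name] for name in layer_sizes)


# Hypothetical two-layer model and one candidate mixed-precision assignment.
weights = {"conv1": [0.9, -0.31, 0.07, 0.55], "fc": [0.2, -0.8, 0.45]}
assignment = {"conv1": 8, "fc": 4}

for name, w in weights.items():
    wq = fake_quantize_int(w, assignment[name])
    print(name, assignment[name], "bits, mean abs error:", mean_abs_error(w, wq))

sizes = {name: len(w) for name, w in weights.items()}
print("total weight bits:", model_size_bits(sizes, assignment))
```

A quantization search, in this simplified view, compares many such assignments and keeps those that give the lowest error for a given cost. Lower bit widths shrink the cost but coarsen the grid, which is why a uniform precision for every layer is often not the best choice.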

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Multi-Head Attention · Attention Is All You Need · Pointwise Convolution · Depthwise Convolution · Depthwise Separable Convolution · Batch Normalization · Softmax · Inverted Residual Block · Linear Layer · Dense Connections