Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning
Lianbo Ma, Jianlun Ma, Yuee Zhou, Guoyang Xie, Qiang He, Zhichao Lu

TL;DR
This paper introduces a novel mixed-precision quantization method that generalizes policies from small to large datasets using sharpness-aware optimization, reducing computational costs and maintaining accuracy.
Contribution
The paper proposes a new approach combining sharpness-aware minimization and adaptive gradient alignment to efficiently transfer quantization policies from small to large datasets.
Findings
Achieved ImageNet accuracy with 0.5% of training data
Reduced computational cost by up to 150% compared to baselines
Validated effectiveness through theoretical analysis and experiments
Abstract
Mixed Precision Quantization (MPQ) has become an essential technique for optimizing neural network by determining the optimal bitwidth per layer. Existing MPQ methods, however, face a major hurdle: they require a computationally expensive search for quantization policies on large-scale datasets. To resolve this issue, we introduce a novel approach that first searches for quantization policies on small datasets and then generalizes them to large-scale datasets. This approach simplifies the process, eliminating the need for large-scale quantization fine-tuning and only necessitating model weight adjustment. Our method is characterized by three key techniques: sharpness-aware minimization for enhanced quantization generalization, implicit gradient direction alignment to handle gradient conflicts among different optimization objectives, and an adaptive perturbation radius to accelerate…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsAdvanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis
MethodsSharpness-Aware Minimization
