GAQAT: gradient-adaptive quantization-aware training for domain generalization

Jiacheng Jiang; Yuan Meng; Chen Tang; Han Yu; Qun Li; Zhi Wang; Wenwu Zhu

arXiv:2412.05551 · cs.CV · December 10, 2024

TL;DR

This paper introduces GAQAT, a novel low-precision training method that stabilizes quantized models for better domain generalization by addressing gradient conflicts in quantization scales.

Contribution

The paper proposes a gradient-adaptive quantization-aware training framework that stabilizes low-precision models for domain generalization, addressing gradient conflicts in quantization scales.

Findings

1. GAQAT improves 3-bit and 4-bit models' performance on PACS by up to 4.5%.

2. 4-bit GAQAT achieves near-lossless performance on DomainNet, surpassing SOTA QAT.

3. Stabilizing gradient conflicts enhances low-precision models' out-of-domain generalization.

Abstract

Research on loss surface geometry, such as Sharpness-Aware Minimization (SAM), shows that flatter minima improve generalization. Recent studies further reveal that flatter minima can also reduce the domain generalization (DG) gap. However, existing flatness-based DG techniques predominantly operate within a full-precision training process, which is impractical for deployment on resource-constrained edge devices that typically rely on lower bit-width representations (e.g., 4 bits, 3 bits). Consequently, low-precision quantization-aware training is critical for optimizing these techniques in real-world applications. In this paper, we observe a significant degradation in performance when applying state-of-the-art DG-SAM methods to quantized models, suggesting that current approaches fail to preserve generalizability during the low-precision training process. To address this limitation, we…
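To make the "lower bit-width representations (e.g., 4 bits, 3 bits)" concrete, here is a minimal sketch of symmetric uniform fake quantization with a single scale, the generic operation that quantization-aware training typically simulates in the forward pass. It is an illustration only, not the GAQAT method: the function name `fake_quantize`, the signed integer range, and the example scale of 0.25 are assumptions that do not come from the paper.

```python
def fake_quantize(values, scale, bits):
    """Round each value to a signed `bits`-bit integer grid, then rescale.

    Hypothetical helper for illustration; not taken from the GAQAT paper.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Signed integer range, e.g. [-8, 7] for 4 bits and [-4, 3] for 3 bits.
    qmin = -(2 ** (bits - 1))
    qmax = 2 ** (bits - 1) - 1
    out = []
    for v in values:
        q = round(v / scale)          # nearest grid index (Python rounds ties to even)
        q = max(qmin, min(qmax, q))   # clip to the representable range
        out.append(q * scale)         # map back to the real-valued domain
    return out


weights = [0.3, -1.0, 5.0]
four_bit = fake_quantize(weights, scale=0.25, bits=4)
three_bit = fake_quantize(weights, scale=0.25, bits=3)
```

With the same scale, the 3-bit grid clips the largest weight harder than the 4-bit grid, which hints at why the choice of scale matters more as the bit-width shrinks. In quantization-aware training the scale is commonly a learned parameter, which is the quantity the TL;DR refers to when it mentions gradient conflicts in quantization scales.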

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning

Methods: Sharpness-Aware Minimization