FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation
Zhuguanyu Wu, Shihe Wang, Jiayi Zhang, Jiaxin Chen, Yunhong Wang

TL;DR
FIMA-Q introduces a novel post-training quantization method for Vision Transformers that leverages Fisher Information Matrix approximation to significantly improve accuracy, especially under low-bit quantization, without retraining.
Contribution
The paper proposes FIMA-Q, a new PTQ approach for ViTs. It establishes a connection between KL divergence and the Fisher Information Matrix (FIM) that allows fast computation of the quantization loss, and it approximates the FIM with a diagonal-plus-low-rank form (DPLR-FIM).
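For orientation, the standard second-order relation between KL divergence and the FIM is sketched below. This is the textbook form for a generic perturbation, not necessarily the exact formulation used in the paper; the symbols \theta (parameters), \delta (a small perturbation, such as one introduced by quantization), and F (the FIM) are notation assumed here.

```latex
D_{\mathrm{KL}}\!\left(p_{\theta} \,\|\, p_{\theta+\delta}\right)
  \approx \tfrac{1}{2}\, \delta^{\top} F(\theta)\, \delta,
\qquad
F(\theta) = \mathbb{E}_{x \sim p_{\theta}}\!\left[
  \nabla_{\theta} \log p_{\theta}(x)\,
  \nabla_{\theta} \log p_{\theta}(x)^{\top}
\right].
```

The zeroth- and first-order terms of the expansion vanish, so the quadratic form in F is the leading term, which is why an FIM estimate can stand in for a Hessian-style quantization loss.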
Findings
Substantially improves accuracy over state-of-the-art PTQ methods.
Effective in low-bit quantization scenarios.
Validated across various vision tasks and ViT architectures.
Abstract
Post-training quantization (PTQ) has stood out as a cost-effective and promising model compression paradigm in recent years, as it avoids computationally intensive model retraining. Nevertheless, current PTQ methods for Vision Transformers (ViTs) still suffer from significant accuracy degradation, especially under low-bit quantization. To address these shortcomings, we analyze the prevailing Hessian-guided quantization loss, and uncover certain limitations of conventional Hessian approximations. By following the block-wise reconstruction framework, we propose a novel PTQ method for ViTs, dubbed FIMA-Q. Specifically, we first establish the connection between KL divergence and FIM, which enables fast computation of the quantization loss during reconstruction. We further propose an efficient FIM approximation method, namely DPLR-FIM, by employing the diagonal plus low-rank principle, and…
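The diagonal plus low-rank principle mentioned above can be illustrated with a minimal sketch. It assumes the FIM is approximated as F ≈ diag(d) + U Uᵀ, with U holding a small number of columns, and evaluates the quadratic loss δᵀFδ without ever forming the full matrix. The function names, argument names, and this specific parameterization are hypothetical illustrations, not taken from the paper or its code.

```python
def dplr_quadratic_loss(delta, diag, low_rank_cols):
    """Evaluate delta^T (diag(d) + U U^T) delta in O(n * r) time.

    delta:         perturbation vector of length n (hypothetical: a quantization error)
    diag:          diagonal entries d of the assumed FIM approximation, length n
    low_rank_cols: the r columns of U, each a list of length n
    """
    # Diagonal part: sum_i d_i * delta_i^2
    diag_term = sum(d * x * x for d, x in zip(diag, delta))
    # Low-rank part: ||U^T delta||^2, one projection per column of U
    low_rank_term = 0.0
    for u in low_rank_cols:
        proj = sum(ui * xi for ui, xi in zip(u, delta))
        low_rank_term += proj * proj
    return diag_term + low_rank_term


def dense_quadratic_loss(delta, diag, low_rank_cols):
    """Reference version that builds the full n x n matrix (O(n^2) memory)."""
    n = len(delta)
    full = [
        [
            sum(u[i] * u[j] for u in low_rank_cols) + (diag[i] if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    return sum(delta[i] * full[i][j] * delta[j] for i in range(n) for j in range(n))


# Tiny worked case: d = [3, 4], U has the single column [1, 1], delta = [1, 2].
example_fast = dplr_quadratic_loss([1.0, 2.0], [3.0, 4.0], [[1.0, 1.0]])
example_dense = dense_quadratic_loss([1.0, 2.0], [3.0, 4.0], [[1.0, 1.0]])
```

Both functions return the same value; the structured version only needs n + n·r stored numbers instead of n², which is the practical appeal of a diagonal plus low-rank form when n is the size of a block's output.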
Taxonomy
Topics: Infrared Target Detection Methodologies
