DFQ-ViT: Data-Free Quantization for Vision Transformers without Fine-tuning
Yujia Tong, Jingling Yuan, Tian Zhang, Jianquan Liu, Chuang Hu

TL;DR
This paper introduces DFQ-ViT, a data-free quantization method for Vision Transformers that synthesizes high-quality calibration samples and aligns intermediate activations, reaching performance close to quantization calibrated on real data, without fine-tuning.
Contribution
It proposes a novel pipeline for data-free quantization of ViTs, including difficulty-based sample synthesis and activation correction, improving performance without fine-tuning.
Findings
DFQ-ViT outperforms existing data-free quantization methods.
It achieves performance comparable to quantization calibrated on real data.
Because no fine-tuning is required, it reduces computational overhead and lowers deployment barriers.
Abstract
Data-Free Quantization (DFQ) enables the quantization of Vision Transformers (ViTs) without requiring access to data, allowing for the deployment of ViTs on devices with limited resources. In DFQ, the quantized model must be calibrated using synthetic samples, making the quality of these synthetic samples crucial. Existing methods fail to fully capture and balance the global and local features within the samples, resulting in limited synthetic data quality. Moreover, we have found that during inference, there is a significant difference in the distributions of intermediate layer activations between the quantized and full-precision models. These issues lead to severe performance degradation of the quantized model. To address these problems, we propose a pipeline for Data-Free Quantization for Vision Transformers (DFQ-ViT). Specifically, we synthesize samples in order of increasing…
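To make the calibration step and the activation mismatch described in the abstract more concrete, here is a minimal sketch of generic min-max uniform quantization. It is not the DFQ-ViT method: the function names (calibrate_minmax, fake_quantize, distribution_gap) are hypothetical, and random Gaussian values stand in for both the synthetic calibration samples and the layer activations. It only illustrates how a quantization range fitted on calibration data can shift the statistics of activations seen at inference.

```python
import random
import statistics


def calibrate_minmax(samples, num_bits=8):
    """Fit scale and zero point for asymmetric uniform quantization.

    Assumption: plain min-max calibration, used here only as an illustration.
    """
    lo, hi = min(samples), max(samples)
    qmax = 2 ** num_bits - 1
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point


def fake_quantize(values, scale, zero_point, num_bits=8):
    """Quantize to integers, clip to the valid range, and map back to floats."""
    qmax = 2 ** num_bits - 1
    out = []
    for v in values:
        q = round(v / scale) + zero_point
        q = max(0, min(qmax, q))  # values outside the calibrated range saturate
        out.append((q - zero_point) * scale)
    return out


def distribution_gap(fp_values, quant_values):
    """Absolute difference in mean and standard deviation between two value sets."""
    mean_gap = abs(statistics.mean(fp_values) - statistics.mean(quant_values))
    std_gap = abs(statistics.pstdev(fp_values) - statistics.pstdev(quant_values))
    return mean_gap, std_gap


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for synthetic calibration samples (hypothetical data).
    calibration = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    # Stand-in for activations at inference, drawn from a wider distribution.
    activations = [rng.gauss(0.5, 2.0) for _ in range(1000)]

    scale, zero_point = calibrate_minmax(calibration, num_bits=4)
    quantized = fake_quantize(activations, scale, zero_point, num_bits=4)
    print(distribution_gap(activations, quantized))
```

In this toy setting, activations that fall outside the range seen during calibration saturate at the clipping bounds, which is one simple way the quantized and full-precision activation statistics can drift apart.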
