Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers

Yunshan Zhong; Yuyao Zhou; Yuxin Zhang; Wanchen Sui; Shen Li; Yong Li; Fei Chao; Rongrong Ji

arXiv:2412.16553·cs.CV·November 3, 2025


Open Access

TL;DR

This paper introduces SARDFQ, a data-free quantization method for Vision Transformers that improves the semantic quality of synthetic images through semantic alignment and reinforcement, which in turn improves quantization accuracy.

Contribution

SARDFQ is presented as the first method to combine semantic alignment and reinforcement specifically for data-free quantization of Vision Transformers, targeting semantic distortion and semantic inadequacy in synthetic images.

Findings

01. Improves top-1 accuracy on ImageNet by 15.52% for W4A4 ViT-B.

02. Effectively reduces semantic distortion and inadequacy in synthetic images.

03. Outperforms existing data-free quantization methods significantly.

Abstract

Data-free quantization (DFQ) enables model quantization without accessing real data, addressing concerns regarding data security and privacy. With the growing adoption of Vision Transformers (ViTs), DFQ for ViTs has garnered significant attention. However, existing DFQ methods exhibit two limitations: (1) semantic distortion, where the semantics of synthetic images deviate substantially from those of real images, and (2) semantic inadequacy, where synthetic images contain extensive regions with limited content and oversimplified textures, leading to suboptimal quantization performance. To address these limitations, we propose SARDFQ, a novel Semantics Alignment and Reinforcement Data-Free Quantization method for ViTs. To address semantic distortion, SARDFQ incorporates Attention Priors Alignment (APA), which optimizes synthetic images to follow randomly generated structure attention…
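As a rough illustration of what a setting such as W4A4 (4-bit weights, 4-bit activations) involves, the sketch below applies plain min-max uniform quantization to a list of floats and maps the integer codes back to floats. It is not SARDFQ and not the quantizer used in the paper, which this excerpt does not describe; the function names `quantize_uniform` and `dequantize` are hypothetical and chosen only for this example.

```python
def quantize_uniform(values, bits=4):
    """Min-max uniform quantization of a list of floats to `bits`-bit codes.

    Illustrative only: a generic asymmetric quantizer, assumed here for
    demonstration, not the scheme from the paper.
    """
    qmax = 2 ** bits - 1                      # largest code, 15 when bits=4
    lo, hi = min(values), max(values)
    # Step size between adjacent codes; fall back to 1.0 for constant input.
    scale = (hi - lo) / qmax if hi > lo else 1.0
    # Integer offset so that the minimum value maps to code 0.
    zero_point = round(-lo / scale)
    codes = [
        min(qmax, max(0, round(v / scale) + zero_point))  # round, then clamp
        for v in values
    ]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Map integer codes back to approximate float values."""
    return [(q - zero_point) * scale for q in codes]
```

With 4 bits there are only 16 representable levels, so each reconstructed value can differ from the original by up to half a step (`scale / 2`). Data-free methods such as the one described here have to calibrate a quantized model under that constraint using synthetic images in place of real data.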


Taxonomy

Topics: CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing · Image Processing Techniques and Applications

Methods: Softmax · Attention Is All You Need