Zero-Shot Quantization via Weight-Space Arithmetic
Daniele Solombrino, Antonio Andrea Gargiulo, Alessandro Zirilli, Luca Zhou, Adrian Robert Minut, Emanuele Rodolà

TL;DR
This paper introduces a zero-shot method that improves post-training quantization (PTQ) accuracy by extracting a quantization vector from a donor task in weight space and applying it to a receiver model, with no receiver training data required.
Contribution
The authors propose a weight-space arithmetic technique that extracts quantization robustness from a donor task and transfers it to a receiver model, improving accuracy after low-bit quantization without receiver training data or receiver-side quantization-aware training (QAT).
Findings
Post-PTQ Top-1 accuracy improves by up to 60 points in a 3-bit setting.
Gains appear across four ViT scales and 22 image classification tasks, often even when donor and receiver tasks differ markedly.
Quantization vectors are proven to be well-defined and unaffected by reparameterization symmetries.
Abstract
We show that robustness to post-training quantization (PTQ) is a transferable direction in weight space. We call this direction the quantization vector: extracted from a donor task by simple weight-space arithmetic, it can be used to patch a receiver model and improve post-PTQ Top-1 accuracy by up to 60 points in a 3-bit setting, without receiver-side quantization-aware training (QAT). Because the method requires no receiver training data, it provides a zero-shot, low-cost alternative to QAT for extremely low-bit deployment. Across four ViT scales and 22 image classification tasks, donor quantization vectors often yield substantial gains even when donor and receiver tasks differ markedly. We further prove rigorously that quantization vectors are well-defined and do not suffer from reparameterization symmetries, and provide a local geometric account of their effect. Together, these…
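The abstract does not spell out the arithmetic, so the following is only a minimal sketch under stated assumptions. By analogy with task-vector arithmetic, it assumes the quantization vector is the per-parameter difference between a quantization-robust donor checkpoint and its ordinary counterpart, and that patching adds this difference to the receiver weights. The function names, the `scale` parameter, and the round-to-nearest quantizer are hypothetical illustrations, not the authors' implementation.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def quantization_vector(donor_robust: Weights, donor_base: Weights) -> Weights:
    """ASSUMED form: quantization-robust donor weights minus ordinary donor weights."""
    return {
        name: [r - b for r, b in zip(donor_robust[name], donor_base[name])]
        for name in donor_base
    }


def patch(receiver: Weights, qvec: Weights, scale: float = 1.0) -> Weights:
    """Add the (optionally scaled) donor vector to the receiver; no receiver data is used.

    `scale` is a hypothetical knob, not something stated in the abstract.
    """
    return {
        name: [w + scale * d for w, d in zip(receiver[name], qvec[name])]
        for name in receiver
    }


def fake_quantize(values: List[float], bits: int = 3) -> List[float]:
    """Symmetric uniform round-to-nearest quantizer, used here as a generic PTQ stand-in."""
    qmax = 2 ** (bits - 1) - 1            # 3 for a 3-bit signed grid
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)
    step = max_abs / qmax
    # Clamp to the signed integer range [-qmax - 1, qmax], then map back to floats.
    return [max(-qmax - 1, min(qmax, round(v / step))) * step for v in values]


# Toy example with made-up numbers.
donor_base = {"w": [0.5, -1.0]}
donor_robust = {"w": [0.75, -1.5]}
receiver = {"w": [1.0, 2.0]}

qvec = quantization_vector(donor_robust, donor_base)
patched = patch(receiver, qvec)
patched_quantized = {name: fake_quantize(vals, bits=3) for name, vals in patched.items()}
```

In this sketch the receiver is patched first and quantized afterwards, which matches the abstract's description of patching a receiver model to improve post-PTQ accuracy; whether the paper rescales the vector or uses a different quantizer is not stated in the text above.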
