BayesQ: Uncertainty-Guided Bayesian Quantization

Ismail Lamaakal; Chaymae Yahyati; Yassine Maleh; Khalid El Makkaoui; Ibrahim Ouahbi

arXiv:2511.08821 · cs.LG · November 13, 2025

Open Access

TL;DR

BayesQ is an uncertainty-guided Bayesian post-training quantization (PTQ) method that optimizes quantization under the posterior expected loss, improving accuracy over strong PTQ baselines at a preprocessing cost comparable to existing methods.

Contribution

BayesQ is presented as the first PTQ framework to optimize quantization under the posterior expected loss, using posterior uncertainty over the weights to guide codebook design and mixed-precision bit allocation.
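
As a minimal illustration of what a posterior-expected distortion looks like, consider the simplest case: one scalar weight with a Gaussian posterior and one fixed codeword. The symbols \mu, \sigma^2 and q are introduced here for illustration only. This is the standard bias-variance identity, not necessarily the exact form of the paper's closed-form tables.

```latex
% Assumption: scalar weight w with posterior N(\mu, \sigma^2), fixed codeword q.
\begin{aligned}
\mathbb{E}_{w \sim \mathcal{N}(\mu,\sigma^2)}\!\left[(w - q)^2\right]
  &= \mathbb{E}\!\left[\big((w - \mu) + (\mu - q)\big)^2\right] \\
  &= \mathbb{E}\!\left[(w - \mu)^2\right]
     + 2(\mu - q)\,\mathbb{E}\!\left[w - \mu\right]
     + (\mu - q)^2 \\
  &= \sigma^2 + (\mu - q)^2 .
\end{aligned}
```

The cross term vanishes because \mathbb{E}[w - \mu] = 0. The expected distortion therefore splits into a term that depends on the codeword, (\mu - q)^2, and a posterior-variance term, \sigma^2, that no choice of codeword can remove.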

Findings

01 Outperforms strong PTQ baselines on ResNet-50 (ImageNet) and BERT-base (GLUE).

02 Gains up to +1.5% top-1 accuracy on ImageNet.

03 Requires preprocessing comparable to existing methods.

Abstract

We present BayesQ, an uncertainty-guided post-training quantization framework that is the first to optimize quantization under the posterior expected loss. BayesQ fits a lightweight Gaussian posterior over weights (diagonal Laplace by default; optional K-FAC/low-rank), whitens by the posterior covariance, designs codebooks to minimize posterior-expected distortion, and allocates mixed precision via a greedy knapsack that maximizes marginal expected-loss reduction per bit under a global budget. For scalar quantizers, posterior-expected MSE yields closed-form tables; task-aware proxies are handled by short Monte Carlo on a small calibration set. An optional calibration-only distillation aligns the quantized model with the posterior predictive teacher. At matched average bits/weight of 3.0/3.5/4.0, BayesQ improves over strong PTQ baselines on ResNet-50 (ImageNet) and BERT-base (GLUE) e.g.,…
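
The greedy knapsack allocation described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `greedy_bit_allocation`, the per-layer `expected_loss` tables, and the `min_bits`/`max_bits` bounds are hypothetical. In practice the tables would hold the posterior-expected loss of each layer at each bit-width, estimated in closed form or by Monte Carlo.

```python
import heapq


def greedy_bit_allocation(expected_loss, sizes, avg_bits, min_bits=2, max_bits=8):
    """Greedy mixed-precision allocation under a global bit budget.

    expected_loss[i][b]: hypothetical estimate of layer i's expected loss
                         at b bits (must cover min_bits..max_bits).
    sizes[i]:            number of weights in layer i.
    avg_bits:            target average bits per weight across all layers.
    """
    n = len(sizes)
    budget = avg_bits * sum(sizes)      # global budget in total bits
    bits = [min_bits] * n               # start every layer at the minimum
    used = min_bits * sum(sizes)
    if used > budget:
        raise ValueError("budget is below the minimum bit-width")

    def push(heap, i):
        # Queue the next one-bit upgrade of layer i, keyed by
        # expected-loss reduction per extra bit spent.
        b = bits[i]
        if b < max_bits:
            gain = expected_loss[i][b] - expected_loss[i][b + 1]
            heapq.heappush(heap, (-gain / sizes[i], i))

    heap = []
    for i in range(n):
        push(heap, i)

    while heap:
        neg_ratio, i = heapq.heappop(heap)
        if neg_ratio >= 0:
            break                       # no remaining upgrade reduces the loss
        if used + sizes[i] > budget:
            continue                    # upgrade does not fit; try other layers
        bits[i] += 1
        used += sizes[i]
        push(heap, i)
    return bits


# Toy example with made-up loss tables for two layers.
toy_loss = [
    {2: 1.0, 3: 0.4, 4: 0.3},           # small layer, 100 weights
    {2: 0.9, 3: 0.3, 4: 0.2},           # large layer, 300 weights
]
toy_sizes = [100, 300]
print(greedy_bit_allocation(toy_loss, toy_sizes, avg_bits=3.0, max_bits=4))
print(greedy_bit_allocation(toy_loss, toy_sizes, avg_bits=3.5, max_bits=4))
```

Ranking upgrades by loss reduction per bit, rather than by raw loss reduction, keeps a large layer from consuming the budget when several small layers would yield more improvement for the same cost. Like any greedy knapsack heuristic, it is not guaranteed to find the optimal allocation.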

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Data Compression Techniques · Adversarial Robustness in Machine Learning