CoreQ: Learning-Free Mismatch Correction and Successive Rounding for Quantization
Seohyeon Cha, Huancheng Chen, Dongjun Kim, Haoran Zhang, Kevin Chan, Gustavo de Veciana, Haris Vikalo

TL;DR
CoreQ is a learning-free post-training quantization framework that adaptively corrects layer mismatch errors in large language models, improving performance without overfitting or hyperparameter tuning.
Contribution
CoreQ introduces a closed-form mismatch correction coefficient, derived from a geometric decomposition, that enables adaptive, hyperparameter-free calibration of quantized models.
Findings
CoreQ outperforms strong PTQ baselines across multiple LLMs and quantization settings.
The method limits overfitting to the small calibration set by adapting the mismatch correction per layer, rather than applying it fully or scaling it with a fixed coefficient.
CoreQ improves perplexity and downstream accuracy in experiments.
Abstract
Post-training quantization (PTQ) enables efficient deployment of large language models by mapping pretrained weights to low-bit formats without retraining, typically using a small calibration set to minimize a layer-wise calibration objective. However, this sequential procedure induces a mismatch: errors from earlier quantized layers alter the inputs received by later layers, causing the activations to deviate from those of the full-precision model. Recent approaches introduce mismatch-aware calibration objectives to compensate for this effect, but leave open how much of the observed mismatch should shift each layer's calibration target. Fully applying this correction can overfit limited calibration data, while scaling the mismatch correction with a fixed coefficient ignores varying reliability of mismatch estimates across layers. To address these limitations, we propose CoreQ, a…
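The mismatch described in the abstract can be illustrated with a small NumPy sketch. It is not CoreQ's method: the round-to-nearest quantizer, the two-layer toy network, and the helper names `quantize_rtn` and `blended_target` are all assumptions made for illustration, and the coefficient `alpha` here is a hand-set scalar rather than the closed-form coefficient the paper derives.

```python
import numpy as np

def quantize_rtn(w, bits=4):
    """Symmetric per-tensor round-to-nearest quantization (illustrative stand-in)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

def blended_target(w, x_fp, x_q, alpha):
    """Hypothetical calibration target for a layer with full-precision weights w.

    alpha = 0: target computed from the quantized model's inputs (no correction).
    alpha = 1: target computed from the full-precision inputs (full correction).
    """
    y_q = w @ x_q
    y_fp = w @ x_fp
    return y_q + alpha * (y_fp - y_q)

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 32))   # calibration batch: 16 features, 32 samples
w1 = rng.standard_normal((16, 16))  # first layer weights
w2 = rng.standard_normal((16, 16))  # second layer weights

# Inputs seen by layer 2 in the full-precision model vs. after layer 1 is quantized.
x_fp = np.maximum(w1 @ x, 0)
x_q = np.maximum(quantize_rtn(w1) @ x, 0)

# Relative input mismatch induced by quantizing the earlier layer.
mismatch = np.linalg.norm(x_fp - x_q) / np.linalg.norm(x_fp)
print(f"relative input mismatch at layer 2: {mismatch:.4f}")

# A fixed-coefficient target; choosing this value per layer is the open question.
target_half = blended_target(w2, x_fp, x_q, alpha=0.5)
```

Setting `alpha` to 1 corresponds to fully applying the correction and setting it to a constant corresponds to fixed scaling, the two options the abstract identifies as problematic.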
