CoreQ: Learning-Free Mismatch Correction and Successive Rounding for Quantization

Seohyeon Cha; Huancheng Chen; Dongjun Kim; Haoran Zhang; Kevin Chan; Gustavo de Veciana; Haris Vikalo

arXiv:2602.05902 · cs.LG · May 12, 2026

TL;DR

CoreQ is a learning-free post-training quantization framework that adaptively corrects layer mismatch errors in large language models, improving performance without overfitting or hyperparameter tuning.

Contribution

CoreQ introduces a closed-form mismatch correction coefficient, derived from a geometric decomposition, that adapts the calibration target per layer without hyperparameter tuning.

Findings

01

CoreQ outperforms strong PTQ baselines across multiple LLMs and quantization settings.

02

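The method reduces overfitting to the limited calibration data by adapting the mismatch correction per layer, rather than applying it fully or scaling it by a fixed coefficient.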

03

CoreQ improves perplexity and downstream task accuracy in the reported experiments.

Abstract

Post-training quantization (PTQ) enables efficient deployment of large language models by mapping pretrained weights to low-bit formats without retraining, typically using a small calibration set to minimize a layer-wise calibration objective. However, this sequential procedure induces a mismatch: errors from earlier quantized layers alter the inputs received by later layers, causing the activations to deviate from those of the full-precision model. Recent approaches introduce mismatch-aware calibration objectives to compensate for this effect, but leave open how much of the observed mismatch should shift each layer's calibration target. Fully applying this correction can overfit limited calibration data, while scaling the mismatch correction with a fixed coefficient ignores varying reliability of mismatch estimates across layers. To address these limitations, we propose CoreQ, a…
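
The mismatch described in the abstract can be illustrated with a small NumPy sketch. This is not the CoreQ method: the abstract is truncated before the closed-form coefficient is stated, so the sketch only shows how quantizing an earlier layer shifts the inputs of a later layer, and how a hand-picked scalar `alpha` would interpolate the later layer's calibration target. The round-to-nearest quantizer, the two-layer linear toy model, and the names `quantize_rtn`, `calibration_target`, and `alpha` are all assumptions made for illustration.

```python
import numpy as np

def quantize_rtn(w, bits=4):
    """Symmetric per-row round-to-nearest quantization (illustrative stand-in)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard against all-zero rows
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

def calibration_target(w, x_fp, x_q, alpha):
    """Hypothetical target: alpha=0 ignores the mismatch, alpha=1 applies it fully."""
    mismatch = w @ (x_fp - x_q)      # output shift caused by the perturbed inputs
    return w @ x_q + alpha * mismatch

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 64))        # 16 features, 64 calibration samples
w1 = rng.normal(size=(16, 16))       # first layer (quantized first)
w2 = rng.normal(size=(16, 16))       # second layer (calibrated next)

x_fp = w1 @ x                        # layer-2 inputs in the full-precision model
x_q = quantize_rtn(w1) @ x           # layer-2 inputs after quantizing layer 1

input_mismatch = np.linalg.norm(x_fp - x_q) / np.linalg.norm(x_fp)
print(f"relative input mismatch at layer 2: {input_mismatch:.4f}")

# Distance between each candidate target and the full-precision output.
gaps = {}
for alpha in (0.0, 0.5, 1.0):
    target = calibration_target(w2, x_fp, x_q, alpha)
    gaps[alpha] = np.linalg.norm(target - w2 @ x_fp)
    print(f"alpha={alpha:.1f}: distance to full-precision output = {gaps[alpha]:.4f}")
```

With `alpha = 1` the target equals the full-precision output, which is the fully corrected objective the abstract warns can overfit a small calibration set. A fixed intermediate `alpha` applies the same fraction at every layer, which is the limitation the abstract attributes to fixed-coefficient scaling.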
