CoQuant: Joint Weight-Activation Subspace Projection for Mixed-Precision LLMs

Zhe Ding; Su Pan; Duowei Pan

arXiv:2604.26378·cs.LG·April 30, 2026

TL;DR

CoQuant is a joint weight-activation subspace projection method for mixed-precision post-training quantization of LLMs. It models the output error as driven by both weight and activation quantization noise, and uses that model to choose the subspace kept in high precision, with the aim of lowering inference cost while maintaining accuracy.

Contribution

The paper presents a theoretically grounded, closed-form weighted PCA formulation that balances weight and activation covariances when selecting the high-precision subspace for low-bit LLM quantization.
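To make the idea of a weighted PCA over two covariances concrete, here is a minimal NumPy sketch. It is not the paper's formulation: the function name `joint_subspace`, the mixing weight `alpha`, the trace normalisation, and the use of uncentred second moments are all assumptions made for illustration, and the paper's closed-form weighting may differ.

```python
import numpy as np

def joint_subspace(X, W, k, alpha=0.5):
    """Hypothetical helper: pick a k-dim input subspace from a mix of two covariances.

    X: (n_samples, d) calibration activations fed to a linear layer.
    W: (d_out, d) weight matrix of that layer.
    alpha: assumed mixing weight between activation and weight statistics.
    """
    cov_x = X.T @ X / X.shape[0]   # (d, d) activation second moment
    cov_w = W.T @ W / W.shape[0]   # (d, d) weight second moment over output rows

    # Normalise by trace so alpha mixes the two terms on a comparable scale
    # (an assumption of this sketch, not something stated in the source).
    cov_x = cov_x / np.trace(cov_x)
    cov_w = cov_w / np.trace(cov_w)

    mixed = alpha * cov_x + (1.0 - alpha) * cov_w

    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(mixed)
    top = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, top], eigvals[top]

# Toy usage with random data standing in for calibration activations and weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 16))
W = rng.normal(size=(32, 16))
U, vals = joint_subspace(X, W, k=4, alpha=0.5)

# Components along U would be kept in high precision; the rest quantized low-bit.
X_high = X @ U              # (256, 4) coordinates in the selected subspace
X_rest = X - X_high @ U.T   # residual orthogonal to the selected subspace
```

Setting `alpha=1.0` in this sketch reduces it to an activation-only PCA, which is the kind of baseline the paper describes prior mixed-precision methods as using.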

Findings

01. Outperforms existing PTQ methods on Llama-3.2 and Qwen2.5 models.

02. Achieves lower perplexity and higher reasoning accuracy.

03. Provides a principled approach for joint weight-activation subspace modeling.

Abstract

Post-training quantization (PTQ) has become an important technique for reducing the inference cost of Large Language Models (LLMs). While recent mixed-precision methods improve ultra-low bit quantization by preserving critical subspaces in high precision, they typically construct these subspaces relying solely on activation statistics. This ignores the fundamental nature of linear operations, where the output perturbation is jointly driven by both activation and weight quantization noise. In this paper, we propose CoQuant, a joint weight-activation subspace projection method. By theoretically modeling the expected output error, CoQuant formulates a closed-form weighted PCA solution that balances activation and weight covariances to select the optimal high-precision subspace. Extensive experiments on Llama-3.2 and Qwen2.5 models show that CoQuant consistently outperforms strong PTQ…
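The claim that output perturbation depends on both noise sources follows from expanding a quantized linear layer. The derivation below is a generic first-order sketch, not the paper's own error model; the symbols $W$, $x$, $\Delta W$, and $\Delta x$ are notation assumed here for the weight matrix, the input activation, and their respective quantization errors.

```latex
\begin{aligned}
y &= W x, \qquad \tilde{y} = (W + \Delta W)(x + \Delta x), \\
\tilde{y} - y &= W\,\Delta x + \Delta W\,x + \Delta W\,\Delta x \\
&\approx W\,\Delta x + \Delta W\,x
\quad \text{(dropping the second-order term } \Delta W\,\Delta x\text{)}.
\end{aligned}
```

The first term scales activation noise by the weights and the second scales weight noise by the activations, so a subspace chosen from activation statistics alone accounts for only part of the error.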

Code & Models

Repositories

Zachary5895/CoQuant (GitHub)
