Astro: Activation-guided Structured Regularization for Outlier-Robust LLM Post-Training Quantization

Xi Chen; Ming Li; Junxi Li; Changsheng Li; Peisong Wang; Lizhong Ding; Ye Yuan; Guoren Wang

arXiv:2602.07596·cs.LG·February 10, 2026

Open Access

TL;DR

Astro is a regularization framework that improves weight-only post-training quantization of LLMs by suppressing outliers through activation-guided weight reconstruction, without adding inference latency.

Contribution

Astro introduces an activation-guided structured regularization method that effectively suppresses outliers in LLM weights during post-training quantization without adding inference latency.

Findings

01

Outperforms complex rotation methods in quantization accuracy.

02

Takes nearly one-third of the quantization time of existing methods.

03

Maintains accuracy without adding inference latency.

Abstract

Weight-only post-training quantization (PTQ) is crucial for efficient Large Language Model (LLM) deployment but suffers from accuracy degradation caused by weight and activation outliers. Existing mitigation strategies often face critical limitations: they either yield insufficient outlier suppression or incur significant deployment inefficiencies, such as inference latency, heavy preprocessing, or reliance on complex operator fusion. To resolve these limitations, we leverage a key insight: over-parameterized LLMs often converge to Flat Minima, implying a vast equivalent solution space where weights can be adjusted without compromising accuracy. Building on this, we propose Astro, an Activation-guided Structured Regularization framework designed to suppress the negative effects of outliers in a hardware-friendly and efficient manner. Leveraging the activation-guided regularization…
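
To make the outlier problem named in the abstract concrete, here is a minimal sketch of how a single large weight can degrade round-to-nearest quantization of the other weights that share its scale. It does not implement Astro; the symmetric 4-bit scheme, the group size of 256, the synthetic Gaussian weights, and the names `quantize_rtn` and `mean_sq_error` are all assumptions chosen for illustration.

```python
import random

def quantize_rtn(weights, bits=4):
    """Symmetric round-to-nearest quantization with one scale per weight group."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit signed integers
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Quantize to integers in [-qmax, qmax], then map back to floats.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]

def mean_sq_error(original, restored):
    """Mean squared difference between two equally long lists."""
    return sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)

rng = random.Random(0)
base = [rng.gauss(0.0, 0.02) for _ in range(256)]   # synthetic small weights
with_outlier = base[:-1] + [1.0]                    # same group, last weight replaced by an outlier

# Compare error on the 255 weights the two groups share, so neither the
# outlier nor the weight it replaced enters the comparison.
err_base = mean_sq_error(base[:-1], quantize_rtn(base)[:-1])
err_shared = mean_sq_error(with_outlier[:-1], quantize_rtn(with_outlier)[:-1])

print(f"error without outlier: {err_base:.2e}")
print(f"error with outlier:    {err_shared:.2e}")
```

Because the outlier sets the scale for the whole group, the quantization step becomes far coarser than the spread of the remaining weights, so their reconstruction error rises. Suppressing outliers before quantization, as the abstract describes, targets this effect.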

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis