SpecQuant: Spectral Decomposition and Adaptive Truncation for Ultra-Low-Bit LLMs Quantization

Zhixiong Zhao; Fangxin Liu; Junjie Wang; Chenyang Guan; Zongwu Wang; Li Jiang; Haibing Guan

arXiv:2511.11663 · cs.LG · April 9, 2026

TL;DR

SpecQuant is an ultra-low-bit quantization method for LLMs based on spectral decomposition; it reduces memory use and computation with minimal accuracy loss.

Contribution

The paper presents a two-stage spectral quantization framework with runtime adaptive truncation for ultra-low-bit LLM deployment.

Findings

01 Achieves 4-bit quantization on LLaMA-3 8B with only 1.5% accuracy loss.

02 Doubles inference speed and reduces memory usage threefold.

03 Effectively suppresses high-frequency noise, improving quantization robustness.

Abstract

The emergence of accurate open large language models (LLMs) has sparked a push for advanced quantization techniques to enable efficient deployment on end-user devices. In this paper, we revisit the challenge of extreme LLM compression -- targeting ultra-low-bit quantization for both activations and weights -- from a Fourier frequency domain perspective. We propose SpecQuant, a two-stage framework that tackles activation outliers and cross-channel variance. In the first stage, activation outliers are smoothed and transferred into the weight matrix to simplify downstream quantization. In the second stage, we apply channel-wise low-frequency Fourier truncation to suppress high-frequency components while preserving essential signal energy, improving quantization robustness. Our method builds on the principle that most of the weight energy is concentrated in low-frequency components, which…
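
The two stages described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the SmoothQuant-style scaling rule with its `alpha` parameter, the choice of treating each weight row as a channel, and the fixed `keep` cutoff are all hypothetical (the paper describes an adaptive truncation, which is not reproduced here).

```python
import numpy as np


def smooth_activations(X, W, alpha=0.5, eps=1e-8):
    """Stage 1 (assumed form): move activation outliers into the weights.

    X: activations, shape (tokens, in_features)
    W: weights, shape (in_features, out_features)
    Each input channel j is divided by a scale s[j] in X and multiplied by
    s[j] in W, so the product X @ W is unchanged.
    """
    act_max = np.abs(X).max(axis=0)   # per-input-channel activation range
    w_max = np.abs(W).max(axis=1)     # per-input-channel weight range
    s = (act_max + eps) ** alpha / (w_max + eps) ** (1.0 - alpha)
    return X / s, W * s[:, None], s


def lowfreq_truncate(W, keep):
    """Stage 2 (assumed form): per-channel low-frequency Fourier truncation.

    Treats each row of W as one channel, keeps the `keep` lowest-frequency
    real-FFT bins, zeroes the rest, and transforms back.
    """
    spec = np.fft.rfft(W, axis=1)
    spec[:, keep:] = 0.0
    return np.fft.irfft(spec, n=W.shape[1], axis=1)


def energy_fraction(W, W_trunc):
    """Fraction of squared-norm energy that survives truncation."""
    return float((W_trunc ** 2).sum() / (W ** 2).sum())
```

In this sketch, `energy_fraction` gives a quick way to check how much of a weight matrix's energy a given cutoff retains. The abstract's premise is that trained weights concentrate most of their energy in low-frequency components; random matrices will not show that concentration, so a meaningful check needs real model weights.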

Code & Models

Repositories

Kishon-zzx/SpecQuant (GitHub)
