Grouped Sequency-arranged Rotation: Optimizing Rotation Transformation for Quantization for Free
Euntae Choi, Sumin Song, Woosang Lim, Sungjoo Yoo

TL;DR
This paper introduces a training-free, rotation-based quantization method that uses Walsh-Hadamard transforms with sequency ordering to reduce quantization error, making LLM deployment at very low bit-widths such as 2-bit more practical.
Contribution
It proposes a novel Grouped Sequency-arranged Rotation (GSR) method that constructs effective rotation matrices without training, outperforming existing techniques at low bit-widths.
Findings
Improved quantization performance at 2-bit precision.
Robust results on reasoning tasks and lower perplexity scores.
Enhancement of existing learned rotation techniques.
Abstract
Large Language Models (LLMs) face deployment challenges due to high computational costs, and while Post-Training Quantization (PTQ) offers a solution, existing rotation-based methods struggle at very low bit-widths like 2-bit. We introduce a novel, training-free approach to construct an improved rotation matrix, addressing the limitations of current methods. The key contributions include leveraging the Walsh-Hadamard transform with sequency ordering, which clusters similar frequency components to reduce quantization error compared to standard Hadamard matrices, significantly improving performance. Furthermore, we propose a Grouped Sequency-arranged Rotation (GSR) using block-diagonal matrices with smaller Walsh blocks, effectively isolating outlier impacts and achieving performance comparable to optimization-based methods without requiring any training. Our method demonstrates robust…
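The two constructions named in the abstract, a sequency-ordered Walsh matrix and a block-diagonal rotation built from smaller Walsh blocks, can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names (`sylvester_hadamard`, `sign_changes`, `walsh_matrix`, `grouped_sequency_rotation`) and the `group_size` parameter are hypothetical, and the sketch only builds the matrices. It does not show where in the model the rotation is applied or how the quantizer is configured, since the summary does not specify that.

```python
import numpy as np


def sylvester_hadamard(n):
    """Unnormalized Hadamard matrix of size n (a power of two), natural order."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h


def sign_changes(mat):
    """Number of sign flips along each row (the row's sequency)."""
    return (np.diff(np.sign(mat), axis=1) != 0).sum(axis=1)


def walsh_matrix(n):
    """Hadamard rows re-sorted by increasing sequency, scaled to be orthonormal."""
    h = sylvester_hadamard(n)
    order = np.argsort(sign_changes(h), kind="stable")
    return h[order] / np.sqrt(n)


def grouped_sequency_rotation(dim, group_size):
    """Block-diagonal matrix with dim // group_size identical Walsh blocks.

    Assumption for illustration: every block has the same size and dim is a
    multiple of group_size.
    """
    if dim % group_size:
        raise ValueError("dim must be a multiple of group_size")
    return np.kron(np.eye(dim // group_size), walsh_matrix(group_size))


# Hypothetical sizes chosen only to keep the example small.
R = grouped_sequency_rotation(dim=16, group_size=4)
x = np.random.default_rng(0).normal(size=16)
x_rot = R @ x  # each group of 4 coordinates is mixed only within itself
```

Because the matrix is block-diagonal, a large value in one group of coordinates cannot spread into the other groups, which is the outlier-isolation idea the abstract describes. Each block is orthonormal, so the whole matrix is an orthogonal rotation and can be undone with its transpose.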
Taxonomy
Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Advanced Neural Network Applications
