TL;DR
This paper reveals that GPTQ, a popular LLM quantization method, is mathematically equivalent to Babai's nearest plane algorithm, providing a geometric interpretation and enabling improved quantization techniques.
Contribution
It establishes a theoretical link between GPTQ and Babai's algorithm, leading to new error bounds and enhanced quantization methods for large language models.
Findings
GPTQ, when executed back-to-front (from the last to the first dimension) on a linear layer, is mathematically identical to Babai's nearest plane algorithm.
The geometric interpretation yields error bounds and motivates improved quantization methods that avoid clipping.
The paper provides GPU inference kernels for the new quantization approach.
Abstract
Quantizing the weights of large language models (LLMs) from 16-bit to lower bitwidth is the de facto approach to deploy massive transformers onto more affordable accelerators. While GPTQ emerged as one of the standard methods for one-shot post-training quantization at LLM scale, its inner workings are described as a sequence of algebraic updates that obscure geometric meaning or worst-case guarantees. In this work, we show that, when executed back-to-front (from the last to first dimension) for a linear layer, GPTQ is mathematically identical to Babai's nearest plane algorithm for the classical closest vector problem (CVP) on a lattice defined by the Hessian matrix of the layer's inputs. This equivalence is based on a sophisticated mathematical argument, and has two analytical consequences: first, the GPTQ error propagation step gains an intuitive geometric interpretation; second, GPTQ…
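To make the object in the abstract concrete, here is a minimal sketch of Babai's nearest plane algorithm for a generic lattice basis, written with NumPy. It is not the paper's implementation: the function name `babai_nearest_plane` and the basis matrix `B` are illustrative assumptions, and the construction of the basis from the layer's Hessian is not shown because this page does not give it. The loop runs from the last coordinate to the first, which mirrors the back-to-front order the abstract describes.

```python
import numpy as np

def babai_nearest_plane(B, t):
    """Approximate closest lattice point: return integer z with B @ z near t.

    B is a hypothetical basis matrix (columns are basis vectors, full column
    rank); t is the target vector. Names are illustrative, not from the paper.
    """
    # Reduced QR factorization: B = Q @ R with R upper triangular.
    Q, R = np.linalg.qr(B)
    # Express the target in the orthonormal (Gram-Schmidt) frame.
    y = Q.T @ t
    n = B.shape[1]
    z = np.zeros(n)
    # Walk from the last coordinate to the first, rounding one at a time.
    for i in range(n - 1, -1, -1):
        # Subtract the contribution of coordinates that are already fixed.
        residual = y[i] - R[i, i + 1:] @ z[i + 1:]
        # Pick the nearest hyperplane along the i-th Gram-Schmidt direction.
        z[i] = np.round(residual / R[i, i])
    return z.astype(int)

# Small usage example with a non-orthogonal basis.
B = np.array([[1.0, 0.5],
              [0.0, 1.0]])
t = np.array([0.9, 0.6])
z = babai_nearest_plane(B, t)
print(z, B @ z)
```

Because each step rounds to the nearest integer, the error along the i-th Gram-Schmidt direction is at most half of |R[i, i]| when the integer coefficients are unbounded. This is the standard source of the worst-case guarantee for the nearest plane algorithm, and it no longer holds once coefficients are clipped to a finite range.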
