The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm

Jiale Chen; Yalda Shabanzadeh; Elvir Crnčević; Torsten Hoefler; Dan Alistarh

arXiv:2507.18553 · cs.LG · May 15, 2026

TL;DR

This paper shows that GPTQ, a popular LLM quantization method, is mathematically identical to Babai's nearest plane algorithm when it is run back-to-front on a linear layer. The equivalence gives GPTQ a geometric interpretation and enables improved quantization techniques.

Contribution

The paper establishes a theoretical link between GPTQ and Babai's algorithm, leading to new error bounds and enhanced quantization methods for large language models.

Findings

01. GPTQ, executed back-to-front, is equivalent to Babai's nearest plane algorithm for linear layers.

02. The geometric interpretation yields error bounds and improved quantization methods that avoid clipping.

03. The paper provides GPU inference kernels for the new quantization approach.

Abstract

Quantizing the weights of large language models (LLMs) from 16-bit to lower bitwidth is the de facto approach to deploy massive transformers onto more affordable accelerators. While GPTQ emerged as one of the standard methods for one-shot post-training quantization at LLM scale, its inner workings are described as a sequence of algebraic updates that obscure geometric meaning or worst-case guarantees. In this work, we show that, when executed back-to-front (from the last to first dimension) for a linear layer, GPTQ is mathematically identical to Babai's nearest plane algorithm for the classical closest vector problem (CVP) on a lattice defined by the Hessian matrix of the layer's inputs. This equivalence is based on a sophisticated mathematical argument, and has two analytical consequences: first, the GPTQ error propagation step gains an intuitive geometric interpretation; second, GPTQ…
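
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of the classical nearest plane procedure on a toy lattice. It is not the paper's implementation: the function names, the 2-D basis, and the target are illustrative assumptions, and the sketch does not construct the lattice from a layer's Hessian or handle clipping. It only shows the back-to-front "round one coordinate, then push the remaining error onto the earlier ones" pattern.

```python
# Illustrative sketch of Babai's nearest plane algorithm (pure Python).
# All names and the toy data below are assumptions made for this example.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(basis):
    """Return the unnormalized Gram-Schmidt vectors of the basis rows."""
    ortho = []
    for b in basis:
        v = list(b)
        for u in ortho:
            mu = dot(b, u) / dot(u, u)
            v = [vi - mu * ui for vi, ui in zip(v, u)]
        ortho.append(v)
    return ortho

def babai_nearest_plane(basis, target):
    """Return integer coefficients c and the lattice vector sum_i c[i] * basis[i]."""
    ortho = gram_schmidt(basis)
    residual = list(target)
    coeffs = [0] * len(basis)
    # Process dimensions from last to first (back-to-front).
    for i in reversed(range(len(basis))):
        # Pick the nearest hyperplane along the i-th Gram-Schmidt direction.
        c = round(dot(residual, ortho[i]) / dot(ortho[i], ortho[i]))
        coeffs[i] = c
        # Subtract the chosen multiple of the basis vector; what remains
        # is the error carried over to the earlier dimensions.
        residual = [r - c * b for r, b in zip(residual, basis[i])]
    lattice_point = [t - r for t, r in zip(target, residual)]
    return coeffs, lattice_point

# Toy example: a 2-D lattice with basis rows (2, 0) and (1, 2).
basis = [[2.0, 0.0], [1.0, 2.0]]
target = [2.6, 3.1]
coeffs, point = babai_nearest_plane(basis, target)
print(coeffs, point)
```

On this toy input the procedure selects the coefficients [0, 2], that is, the lattice point 0·(2, 0) + 2·(1, 2). For the classical algorithm the squared distance between target and output is at most one quarter of the sum of the squared Gram-Schmidt norms; the exact form of the bound derived in the paper may be stated differently.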

Code & Models

Repositories

IST-DASLab/GPTQ-Babai (GitHub)

Videos

The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm (SlidesLive)