Only relative ranks matter in weight-clustered large language models

Borja Aizpurua; Sukhbinder Singh; Román Orús

arXiv:2603.17917 · cs.LG · March 19, 2026

Open Access

TL;DR

This paper shows that, in large language models, the relative rank of weights matters more for performance than their exact values. It proposes a simple, training-free weight clustering method that compresses models on disk and uses it to show how much robustness depends on preserving rank.

Contribution

The paper introduces a rank-based perspective on LLMs: it shows that preserving the rank order of weights is key for robustness, and it proposes a simple clustering-based compression method that requires no retraining.

Findings

01. Weight clustering preserves strong accuracy with only 16-64 shared values per weight matrix.

02. Scrambling the relative ranks of the cluster means degrades model quality sharply.

03. Affine correction delays scale drift and maintains rank order.

Abstract

Large language models (LLMs) contain billions of parameters, yet many exact values are not essential. We show that what matters most is the relative rank of weights (whether one connection is stronger or weaker than another) rather than precise magnitudes. To reduce the number of unique weight values, we apply weight clustering to pretrained models, replacing every weight matrix with K shared values from K-means. For Llama 3.1-8B-Instruct and SmolLM2-135M, reducing each matrix to only 16-64 distinct values preserves strong accuracy without retraining, providing a simple, training-free method to compress LLMs on disk. Optionally fine-tuning only the cluster means (centroids) recovers 30-40 percent of the remaining accuracy gap at minimal cost. We then systematically randomize cluster means while keeping assignments fixed. Scrambling the relative ranks of the clusters degrades quality…
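
The clustering and randomization steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code. The helper name `cluster_weights`, the quantile initialization, the iteration count, and the random matrix standing in for a pretrained weight matrix are all assumptions made for illustration. The rank-preserving variant at the end (fresh random values, sorted) is one possible reading of "randomizing cluster means while keeping ranks", not necessarily the procedure used in the paper.

```python
import numpy as np


def cluster_weights(W, k, iters=25):
    """1-D K-means (Lloyd's algorithm) over the entries of one weight matrix.

    Returns (centroids sorted ascending, integer assignments shaped like W).
    Hypothetical helper for illustration only.
    """
    flat = W.ravel()
    # Quantile initialization: an assumption of this sketch.
    centroids = np.quantile(flat, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        # Assign every weight to its nearest centroid.
        assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = flat[assign == j]
            if members.size:
                centroids[j] = members.mean()
    # Sort so that cluster index equals rank, then recompute assignments.
    centroids = np.sort(centroids)
    assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids, assign.reshape(W.shape)


rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))  # stand-in for a pretrained weight matrix
K = 16

centroids, assign = cluster_weights(W, K)

# Clustered matrix: every entry is replaced by its cluster mean.
W_clustered = centroids[assign]

# Rank scrambling: same K values, same assignments, shuffled order.
scrambled = rng.permutation(centroids)
W_scrambled = scrambled[assign]

# Rank-preserving randomization (assumed variant): new random values, sorted,
# so a cluster that ranked below another still ranks below it.
resampled = np.sort(rng.normal(size=K))
W_resampled = resampled[assign]

mse_clustered = np.mean((W - W_clustered) ** 2)
mse_scrambled = np.mean((W - W_scrambled) ** 2)
print(f"distinct values: {np.unique(W_clustered).size}")
print(f"MSE clustered: {mse_clustered:.4f}  MSE scrambled: {mse_scrambled:.4f}")
```

On disk, a matrix stored this way needs only the K centroid values plus one cluster index per weight. With K = 16, each index fits in 4 bits.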

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)