Only relative ranks matter in weight-clustered large language models
Borja Aizpurua, Sukhbinder Singh, Román Orús

TL;DR
In large language models, preserving the relative rank of weights matters more for performance than their exact values. The paper proposes a simple, training-free weight clustering method that compresses models effectively and uses it to show that rank preservation is central to robustness.
Contribution
The paper introduces a novel rank-based perspective for understanding LLMs, showing that weight rank preservation is key for robustness and proposing a simple clustering-based compression method that requires no retraining.
Findings
Weight clustering preserves accuracy with only 16-64 shared values per weight matrix, without retraining.
Scrambling the relative ranks of cluster means degrades model quality sharply.
Affine correction delays scale drift and maintains rank order.
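To make the last two findings concrete, the sketch below contrasts a rank-scrambling permutation of cluster means with a rank-preserving affine map. It assumes the correction has the form w' = a*w + b with a > 0, with a and b chosen here to match the mean and standard deviation of the original centroids. That choice, the toy centroids, and all function names are illustrative assumptions, not the paper's stated procedure.

```python
import random


def rank_order(values):
    """Indices that sort `values` ascending; equal orders mean equal relative ranks."""
    return sorted(range(len(values)), key=lambda i: values[i])


def affine(values, a, b):
    """Apply w -> a*w + b. The rank order is preserved whenever a > 0."""
    return [a * v + b for v in values]


def scramble(values, seed=0):
    """Permute the values at random: same multiset, different ranks per slot."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def mean_std(values):
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, var ** 0.5


def affine_correct(perturbed, reference):
    """Assumed correction: rescale and shift to match the reference mean and std."""
    mp, sp = mean_std(perturbed)
    mr, sr = mean_std(reference)
    a = sr / sp          # positive, so ranks are untouched
    b = mr - a * mp
    return affine(perturbed, a, b)


# Toy centroids standing in for the K shared values of one matrix.
rng = random.Random(0)
centroids = [rng.gauss(0.0, 0.02) for _ in range(16)]

scrambled = scramble(centroids, seed=1)
drifted = affine(centroids, 1.5, 0.01)            # simulated scale drift
corrected = affine_correct(drifted, centroids)

print(rank_order(scrambled) == rank_order(centroids))   # compares ranks after scrambling
print(rank_order(corrected) == rank_order(centroids))   # compares ranks after correction
```

Note that a permutation leaves the mean and variance of the centroids unchanged, so any difference in behavior between `scrambled` and `corrected` in this toy setup comes from rank order alone.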
Abstract
Large language models (LLMs) contain billions of parameters, yet many exact values are not essential. We show that what matters most is the relative rank of weights (whether one connection is stronger or weaker than another) rather than precise magnitudes. To reduce the number of unique weight values, we apply weight clustering to pretrained models, replacing every weight matrix with K shared values from K-means. For Llama 3.1-8B-Instruct and SmolLM2-135M, reducing each matrix to only 16-64 distinct values preserves strong accuracy without retraining, providing a simple, training-free method to compress LLMs on disk. Optionally fine-tuning only the cluster means (centroids) recovers 30-40 percent of the remaining accuracy gap at minimal cost. We then systematically randomize cluster means while keeping assignments fixed. Scrambling the relative ranks of the clusters degrades quality…
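The clustering step described in the abstract can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function names `kmeans_1d` and `cluster_matrix`, the quantile initialization, the iteration count, and the random toy matrix are all assumptions made for this example, and a real model would use an optimized K-means on tensors.

```python
import random


def kmeans_1d(values, k, iters=20):
    """Lloyd's algorithm on scalars. Returns (centroids, assignments)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: initialize centroids at evenly spaced quantiles.
    centroids = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]

    def assign(cents):
        # Index of the nearest centroid for every value.
        return [min(range(k), key=lambda j: abs(v - cents[j])) for v in values]

    for _ in range(iters):
        assignments = assign(centroids)
        sums, counts = [0.0] * k, [0] * k
        for v, a in zip(values, assignments):
            sums[a] += v
            counts[a] += 1
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sums[j] / counts[j] if counts[j] else centroids[j]
                     for j in range(k)]
    return centroids, assign(centroids)


def cluster_matrix(matrix, k):
    """Replace every entry of `matrix` with one of k shared values."""
    flat = [w for row in matrix for w in row]
    centroids, assignments = kmeans_1d(flat, k)
    cols = len(matrix[0])
    clustered = [
        [centroids[a] for a in assignments[r * cols:(r + 1) * cols]]
        for r in range(len(matrix))
    ]
    return clustered, centroids, assignments


# Toy stand-in for one pretrained weight matrix (random, not real model weights).
rng = random.Random(0)
weights = [[rng.gauss(0.0, 0.02) for _ in range(32)] for _ in range(32)]
clustered, centroids, assignments = cluster_matrix(weights, k=16)

distinct = {w for row in clustered for w in row}
print(len(distinct))  # number of distinct values left in the matrix (at most k)
```

After clustering, a matrix can be stored as K centroids plus one index per weight; with K = 16 each index fits in 4 bits. Keeping `assignments` fixed while changing only `centroids` is also the setup the abstract describes for the randomization experiments.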
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
