Hierarchical Sparse Plus Low Rank Compression of LLM

Pawan Kumar; Aditi Gupta

arXiv:2601.07839·cs.LG·January 14, 2026

TL;DR

This paper introduces Hierarchical Sparse Plus Low-Rank (HSS) compression for large language models, combining sparsity and low-rank factorization to reduce memory and computation while maintaining performance.

Contribution

The paper proposes a two-stage HSS compression method that combines a recursive block hierarchy with a reverse Cuthill-McKee (RCM) permutation, improving the efficiency and compressibility of LLMs.

Findings

01. HSS achieves significant memory savings on LLaMA-7B.

02. HSS maintains state-of-the-art perplexity scores.

03. HSS outperforms classical sparse-plus-SVD methods.

Abstract

Modern large language models (LLMs) place extraordinary pressure on memory and compute budgets, making principled compression indispensable for both deployment and continued training. We present Hierarchical Sparse Plus Low-Rank (HSS) compression, a two-stage scheme that (i) moves the largest-magnitude weights into a sparse matrix S and (ii) applies a recursive Hierarchically Sparse Separable (HSS) low-rank factorisation to the dense residual matrix. A recursive rank-reducing strategy and a reverse Cuthill-McKee (RCM) permutation are introduced to move large weights towards the diagonal, in line with the block-diagonal hierarchy, maximising off-diagonal compressibility (because they are touched only once). HSS is hardware-friendly: its matrix-vector multiply reduces to one sparse multiplication and a sequence of thin-matrix multiplications, and the compressed model can be trained end-to-end with standard optimisers. Experiments…
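The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`split_sparse`, `low_rank`, `build_tree`, `tree_matvec`), the fixed `rank`, the `leaf_size` cutoff, and the use of a truncated SVD for every off-diagonal block are all assumptions made for illustration. The RCM permutation and the rank-reducing schedule are omitted, and S is held as a dense array for brevity.

```python
import numpy as np


def split_sparse(W, keep_frac):
    """Stage (i): move the largest-magnitude entries of W into S.

    Returns (S, R) with W == S + R. S is stored densely here for brevity;
    a practical version would use a sparse format.
    """
    k = max(1, int(keep_frac * W.size))
    # Magnitude of the k-th largest entry, used as the cutoff.
    thresh = np.partition(np.abs(W).ravel(), -k)[-k]
    S = np.where(np.abs(W) >= thresh, W, 0.0)
    return S, W - S


def low_rank(B, rank):
    """Truncated SVD of block B as thin factors (U, V) with B ~= U @ V."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    r = min(rank, len(s))
    return U[:, :r] * s[:r], Vt[:r]


def build_tree(R, rank, leaf_size):
    """Stage (ii), sketched: recurse on diagonal blocks of a square R and
    replace each off-diagonal block with thin low-rank factors."""
    n = R.shape[0]
    if n <= leaf_size:
        return {"dense": R.copy()}
    h = n // 2
    return {
        "h": h,
        "top": build_tree(R[:h, :h], rank, leaf_size),
        "bottom": build_tree(R[h:, h:], rank, leaf_size),
        "upper": low_rank(R[:h, h:], rank),  # off-diagonal, compressed once
        "lower": low_rank(R[h:, :h], rank),
    }


def tree_matvec(node, x):
    """Multiply the hierarchical approximation of R by x using only
    small dense leaves and thin-matrix products."""
    if "dense" in node:
        return node["dense"] @ x
    h = node["h"]
    x1, x2 = x[:h], x[h:]
    U12, V12 = node["upper"]
    U21, V21 = node["lower"]
    y1 = tree_matvec(node["top"], x1) + U12 @ (V12 @ x2)
    y2 = tree_matvec(node["bottom"], x2) + U21 @ (V21 @ x1)
    return np.concatenate([y1, y2])


# Toy usage on a random 16x16 matrix (hypothetical sizes and settings).
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 16))
S, R = split_sparse(W, keep_frac=0.1)
tree = build_tree(R, rank=2, leaf_size=4)
x = rng.standard_normal(16)
y = S @ x + tree_matvec(tree, x)  # one sparse term plus thin products
rel_err = np.linalg.norm(y - W @ x) / np.linalg.norm(W @ x)
```

In this sketch the approximation error comes only from the truncated off-diagonal blocks: when `rank` is at least as large as every off-diagonal block, the product matches `W @ x` up to rounding. A random Gaussian matrix has no low-rank structure, so `rel_err` for the toy example is not indicative of behaviour on trained weights.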

Taxonomy

Topics: Natural Language Processing Techniques · Big Data and Digital Economy · Topic Modeling