3BASiL: An Algorithmic Framework for Sparse plus Low-Rank Compression of LLMs

Mehdi Makni; Xiang Meng; Rahul Mazumder

arXiv:2603.01376·cs.LG·March 3, 2026

TL;DR

This paper introduces 3BASiL-TM, a novel one-shot post-training framework for decomposing large language models into sparse and low-rank components, significantly improving compression quality and efficiency.

Contribution

The paper proposes a new 3-Block ADMM method and a transformer-matching refinement for effective sparse plus low-rank decomposition of LLMs, with convergence guarantees and broad applicability.

Findings

01

Reduces the WikiText2 perplexity gap to the dense model by over 30% relative to prior methods.

02

Achieves 2.5x faster compression runtime on GPU.

03

Outperforms prior sparse plus low-rank methods in compression quality.

Abstract

Sparse plus Low-Rank $(S + LR)$ decomposition of Large Language Models (LLMs) has emerged as a promising direction in model compression, aiming to decompose pre-trained model weights into a sum of sparse and low-rank matrices $(W \approx S + LR)$. Despite recent progress, existing methods often suffer from substantial performance degradation compared to dense models. In this work, we introduce 3BASiL-TM, an efficient one-shot post-training method for $(S + LR)$ decomposition of LLMs that addresses this gap. Our approach first introduces a novel 3-Block Alternating Direction Method of Multipliers (ADMM) method, termed 3BASiL, to minimize the layer-wise reconstruction error with convergence guarantees. We then design an efficient transformer-matching (TM) refinement step that jointly optimizes the sparse and low-rank…
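
To make the $(W \approx S + LR)$ idea concrete, here is a minimal NumPy sketch of a naive alternating heuristic: a truncated SVD fits the low-rank part, then magnitude pruning fits the sparse part. This is not the 3BASiL ADMM algorithm or the TM refinement described in the abstract. It ignores activations and layer-wise reconstruction error entirely, and the function name `sparse_plus_low_rank` and its parameters `rank`, `keep_frac`, and `n_iters` are hypothetical names chosen for illustration.

```python
import numpy as np


def sparse_plus_low_rank(W, rank, keep_frac, n_iters=20):
    """Naive alternating heuristic for W ~ S + L @ R (illustration only).

    S keeps at most keep_frac * W.size nonzero entries (unstructured sparsity);
    L has shape (m, rank) and R has shape (rank, n).
    """
    S = np.zeros_like(W)
    k = int(keep_frac * W.size)  # number of entries S may keep
    for _ in range(n_iters):
        # Low-rank step: best rank-`rank` approximation of W - S
        # in Frobenius norm, via truncated SVD.
        U, sig, Vt = np.linalg.svd(W - S, full_matrices=False)
        L = U[:, :rank] * sig[:rank]
        R = Vt[:rank, :]
        # Sparse step: keep the k largest-magnitude entries of W - L @ R.
        resid = W - L @ R
        S = np.zeros_like(W)
        if k > 0:
            idx = np.argpartition(np.abs(resid), -k, axis=None)[-k:]
            S.flat[idx] = resid.flat[idx]
    return S, L, R


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((64, 48))
    S, L, R = sparse_plus_low_rank(W, rank=4, keep_frac=0.5)
    rel_err = np.linalg.norm(W - S - L @ R) / np.linalg.norm(W)
    print(f"nonzeros in S: {np.count_nonzero(S)}, relative error: {rel_err:.3f}")
```

Each step exactly minimizes the Frobenius error over one block with the other held fixed, so the error never increases across iterations. The sketch gives no guarantee beyond that, whereas the abstract states convergence guarantees for the 3-Block ADMM method on the layer-wise reconstruction objective.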

Videos

3BASiL: An Algorithmic Framework for Sparse plus Low-Rank Compression of LLMs · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Tensor decomposition and applications · Generative Adversarial Networks and Image Synthesis