Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language Models

Ryan Solgi; Kai Zhen; Rupak Vignesh Swaminathan; Nathan Susanj; Athanasios Mouchtaris; Siegfried Kunzmann; Zheng Zhang

arXiv:2505.14871·cs.CL·October 14, 2025

TL;DR

This paper introduces Saten, a novel sparse augmented tensor network method that significantly improves post-training compression and accuracy of large language models, enabling efficient deployment on resource-limited devices.

Contribution

Saten is a new framework that enhances tensorized LLMs with sparsity, allowing full model compression and improved performance without access to pretraining data.

Findings

1. Saten achieves state-of-the-art compression and accuracy.

2. Saten enhances tensorized language models during fine-tuning.

3. Experimental results validate the effectiveness of Saten.

Abstract

The efficient implementation of large language models (LLMs) is crucial for deployment on resource-constrained devices. Low-rank tensor compression techniques, such as tensor-train (TT) networks, have been widely studied for over-parameterized neural networks. However, applying them to compress pre-trained LLMs for downstream tasks (post-training) remains challenging because of the high-rank nature of pre-trained LLMs and the lack of access to pretraining data. In this study, we investigate low-rank tensorized LLMs during fine-tuning and propose sparse augmented tensor networks (Saten) to enhance their performance. The proposed Saten framework enables full model compression. Experimental results demonstrate that Saten enhances both accuracy and compression efficiency in tensorized language models, achieving state-of-the-art performance.
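The abstract does not spell out the Saten algorithm, so the following is only a generic illustration of the idea it names: a low-rank tensor-train approximation that is augmented with a sparse correction term. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation. The function names (`tt_svd`, `tt_reconstruct`, `top_k_sparse`), the plain reshape of a 16 x 16 matrix into a 4 x 4 x 4 x 4 tensor, the rank cap of 2, and the choice of keeping the 16 largest-magnitude residual entries are all hypothetical choices made for illustration.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Standard TT-SVD: sequential truncated SVDs yield cores of shape (r_prev, n_k, r_k)."""
    shape = tensor.shape
    cores = []
    r_prev = 1
    mat = tensor.reshape(r_prev * shape[0], -1)
    for k in range(len(shape) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))  # cap the TT rank (assumed hyperparameter)
        cores.append(u[:, :r].reshape(r_prev, shape[k], r))
        # Carry the remaining factor forward and fold in the next mode.
        mat = (s[:r, None] * vt[:r]).reshape(r * shape[k + 1], -1)
        r_prev = r
    cores.append(mat.reshape(r_prev, shape[-1], 1))
    return cores


def tt_reconstruct(cores):
    """Contract the TT cores back into a dense tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape([c.shape[1] for c in cores])


def top_k_sparse(residual, k):
    """Keep the k largest-magnitude entries of the residual and zero the rest."""
    flat = residual.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    sparse = np.zeros_like(flat)
    sparse[idx] = flat[idx]
    return sparse.reshape(residual.shape)


# Toy "weight matrix": random, hence high-rank, like the case the abstract describes.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 16))
T = W.reshape(4, 4, 4, 4)  # assumed tensorization, chosen for simplicity

cores = tt_svd(T, max_rank=2)
approx = tt_reconstruct(cores)
S = top_k_sparse(T - approx, k=16)  # sparse augmentation of the low-rank part

err_tt = np.linalg.norm(T - approx)
err_aug = np.linalg.norm(T - approx - S)
n_params = sum(c.size for c in cores) + 16  # TT cores plus stored sparse values
print(err_tt, err_aug, n_params, T.size)
```

Because the sparse term exactly cancels the residual entries it keeps, the augmented error can never exceed the TT-only error. How the sparse pattern is actually chosen and trained in Saten is a matter for the paper itself.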

Taxonomy

Topics: Topic Modeling · Tensor decomposition and applications · Advanced Neural Network Applications