Low-Rank Tensor Approximation of Weights in Large Language Models via Cosine Lanczos Bidiagonalization

A. El Ichi; K. Jbilou

arXiv:2601.17112 · cs.LG · January 27, 2026

TL;DR

This paper proposes a tensor compression method that uses the c-product and Cosine Lanczos Bidiagonalization to approximate large language model weights efficiently, reducing memory and computation costs.
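For orientation, the classical matrix version of Lanczos (Golub-Kahan) bidiagonalization can be sketched as below. This is an illustrative sketch only, not the paper's cosine tensor variant: it operates on an ordinary matrix, and the function name `golub_kahan`, the random starting vector, and the full reorthogonalization step are assumptions made here for a stable, self-contained example.

```python
import numpy as np

def golub_kahan(A, k, seed=0):
    """Hypothetical helper: k steps of Golub-Kahan bidiagonalization.

    Returns U (m x k), B (k x k, upper bidiagonal), V (n x k) with
    orthonormal columns in U and V such that A @ V = U @ B, so that
    U @ B @ V.T is a rank-k approximation of A.
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    U = np.zeros((m, k))
    V = np.zeros((n, k))
    alphas = np.zeros(k)       # diagonal of B
    betas = np.zeros(k - 1)    # superdiagonal of B

    # Random unit starting vector (an assumption; any nonzero vector works).
    v = rng.standard_normal(n)
    V[:, 0] = v / np.linalg.norm(v)
    u = A @ V[:, 0]
    alphas[0] = np.linalg.norm(u)
    U[:, 0] = u / alphas[0]

    for j in range(1, k):
        # Next right vector: A^T u_{j-1} - alpha_{j-1} v_{j-1}
        v = A.T @ U[:, j - 1] - alphas[j - 1] * V[:, j - 1]
        v -= V[:, :j] @ (V[:, :j].T @ v)   # full reorthogonalization
        betas[j - 1] = np.linalg.norm(v)
        V[:, j] = v / betas[j - 1]

        # Next left vector: A v_j - beta_{j-1} u_{j-1}
        u = A @ V[:, j] - betas[j - 1] * U[:, j - 1]
        u -= U[:, :j] @ (U[:, :j].T @ u)   # full reorthogonalization
        alphas[j] = np.linalg.norm(u)
        U[:, j] = u / alphas[j]

    B = np.diag(alphas) + np.diag(betas, 1)
    return U, B, V
```

The sketch does not handle breakdown (a zero `alpha` or `beta`), which can occur when `k` exceeds the rank of `A`.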

Contribution

It introduces a novel tensor approximation framework that leverages the algebraic structure of the c-product for efficient compression of LLM weights.
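As a rough illustration of the algebraic structure involved, the sketch below treats the c-product of two third-order tensors as facewise matrix multiplication in a discrete cosine transform (DCT) domain along the third mode. This is an assumption: the exact transform and normalization used in the paper may differ, and the names `dct_matrix` and `c_product` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def c_product(A, B):
    """Assumed c-product: multiply frontal slices in the DCT domain.

    A has shape (m, l, p), B has shape (l, n, p); the result is (m, n, p).
    """
    C = dct_matrix(A.shape[2])
    # Transform each tube (third-mode fiber) with the DCT.
    A_hat = np.einsum('ijn,kn->ijk', A, C)
    B_hat = np.einsum('ijn,kn->ijk', B, C)
    # Facewise product: slice k of A_hat times slice k of B_hat.
    P_hat = np.einsum('ilk,ljk->ijk', A_hat, B_hat)
    # Inverse transform (C is orthogonal, so its inverse is C.T).
    return np.einsum('ijk,kn->ijn', P_hat, C)
```

With a third dimension of size 1 the transform is the identity, so this product reduces to ordinary matrix multiplication.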

Findings

01. Enables low-rank approximation of LLM weight tensors

02. Exploits multidimensional correlations beyond SVD

03. Achieves computationally efficient compression

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse natural language tasks but suffer from extremely large memory footprints and computational costs. In this paper, we introduce a tensor compression framework based on the c-product for computing low-rank approximations. In the first part of our approach, we leverage the algebraic structure of the c-product to represent weight tensors, such as those in embedding layers, attention projections, and feed-forward networks, in a transform domain where frontal slices can be jointly approximated by low-rank tensor factors. This enables computationally efficient compression that exploits multidimensional correlations beyond traditional SVD methods.
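The transform-domain idea described in the abstract can be sketched roughly as follows: apply a DCT along the third mode of a weight tensor, approximate each frontal slice by a low-rank matrix, and transform back. This is a minimal sketch under stated assumptions, not the authors' algorithm: a truncated SVD per slice stands in for the Cosine Lanczos Bidiagonalization, an orthonormal DCT-II is assumed as the transform, and the names `dct_matrix` and `transform_lowrank` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def transform_lowrank(W, rank):
    """Rank-`rank` approximation of every frontal slice in the DCT domain.

    W has shape (m, n, p). Truncated SVD is used here as a stand-in for
    a Lanczos bidiagonalization step.
    """
    C = dct_matrix(W.shape[2])
    W_hat = np.einsum('ijn,kn->ijk', W, C)       # DCT along mode 3
    approx_hat = np.empty_like(W_hat)
    for k in range(W_hat.shape[2]):
        U, s, Vt = np.linalg.svd(W_hat[:, :, k], full_matrices=False)
        # Keep the leading `rank` singular triplets of this slice.
        approx_hat[:, :, k] = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    return np.einsum('ijk,kn->ijn', approx_hat, C)  # inverse DCT

# Usage: compress a random stand-in for a stacked weight tensor.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6, 4))
W_r = transform_lowrank(W, rank=2)
rel_err = np.linalg.norm(W - W_r) / np.linalg.norm(W)
```

Because the assumed transform is orthogonal, the Frobenius error in the original domain equals the error in the transform domain, so the overall error does not increase as `rank` grows. Storing the factors rather than the full slices takes `rank * (m + n)` numbers per slice instead of `m * n`.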

Taxonomy

Topics: Tensor decomposition and applications · Topic Modeling · Generative Adversarial Networks and Image Synthesis