Low-Rank Tensor Approximation of Weights in Large Language Models via Cosine Lanczos Bidiagonalization
A. El Ichi, K. Jbilou

TL;DR
This paper proposes a tensor compression method that uses the c-product and Cosine Lanczos Bidiagonalization to compute low-rank approximations of large language model weights, with the aim of reducing memory and computation costs.
Contribution
It introduces a tensor approximation framework that leverages the algebraic structure of the c-product to compress LLM weight tensors efficiently.
Findings
Enables low-rank approximation of LLM weight tensors
Exploits multidimensional correlations that traditional matrix SVD methods do not capture
Achieves computationally efficient compression
Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse natural language tasks but suffer from extremely large memory footprints and computational costs. In this paper, we introduce a tensor compression framework based on the c-product for computing low-rank approximations. In the first part of our approach, we leverage the algebraic structure of the c-product to represent weight tensors, such as those in embedding layers, attention projections, and feed-forward networks, in a transform domain where frontal slices can be jointly approximated by low-rank tensor factors. This enables computationally efficient compression that exploits multidimensional correlations beyond traditional SVD methods.
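To make the transform-domain idea in the abstract concrete, here is a minimal sketch under stated assumptions, not the authors' algorithm. It assumes a plain orthonormal DCT-II along the third mode as a stand-in for the cosine transform (the exact transform behind the c-product may differ), and it uses a truncated SVD of each frontal slice in place of Cosine Lanczos Bidiagonalization. The function names dct_matrix, mode3_transform, and low_rank_transform_domain are hypothetical and chosen for illustration only.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (assumed stand-in transform)."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    j = np.arange(n)[None, :]   # sample index (columns)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)     # rescale first row so that C @ C.T = I
    return C

def mode3_transform(T, M):
    """Apply the matrix M along the third mode of T (shape n1 x n2 x n3)."""
    return np.einsum('kl,ijl->ijk', M, T)

def low_rank_transform_domain(T, rank):
    """Truncate every frontal slice to `rank` in the cosine-transform domain.

    Truncated SVD is used here as a simple substitute for the Lanczos
    bidiagonalization named in the paper.
    """
    n3 = T.shape[2]
    C = dct_matrix(n3)
    T_hat = mode3_transform(T, C)             # move to the transform domain
    A_hat = np.empty_like(T_hat)
    for k in range(n3):                       # slice-wise low-rank truncation
        U, s, Vt = np.linalg.svd(T_hat[:, :, k], full_matrices=False)
        A_hat[:, :, k] = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    return mode3_transform(A_hat, C.T)        # C is orthogonal, so C.T inverts it

# Usage on a random tensor standing in for a reshaped weight tensor.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6, 4))
W_approx = low_rank_transform_domain(W, rank=2)
rel_err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
```

Because the assumed transform is orthogonal, the approximation error in the original domain equals the error in the transform domain, so the discarded singular values of the transformed slices determine the overall error.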
Taxonomy
Topics: Tensor decomposition and applications · Topic Modeling · Generative Adversarial Networks and Image Synthesis
