Matrix Compression via Randomized Low Rank and Low Precision Factorization

Rajarshi Saha; Varun Srivastava; Mert Pilanci

arXiv:2310.11028 · cs.LG · October 18, 2023 · 5 cites

TL;DR

This paper introduces a randomized low rank and low precision matrix factorization algorithm that effectively compresses large matrices, enabling significant storage reduction while maintaining or improving task performance.

Contribution

The paper presents a novel algorithm combining randomized sketching and quantization for low rank matrix approximation and compression, with theoretical error bounds and practical applications.

Findings

01 Achieves compression rates as low as one bit per matrix element.

02 Maintains or surpasses the performance of traditional compression methods on image and text tasks.

03 Effectively compresses the layers of large models such as LlaMa-7b.
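
To make the "one bit per matrix element" figure concrete, the storage cost of a quantized factorization can be worked out directly. The helper below is an illustrative sketch and is not taken from the paper or its repository: the function name `bits_per_element` and the example sizes (a 4096 × 4096 matrix, rank 512, 4-bit factors) are assumptions chosen so the arithmetic is easy to check, and the count ignores any per-factor quantizer metadata such as scales or offsets.

```python
def bits_per_element(n: int, d: int, k: int, bits_left: int, bits_right: int) -> float:
    """Average bits stored per entry of an n x d matrix A when A is replaced by
    L (n x k, bits_left bits per entry) and R (k x d, bits_right bits per entry).
    Quantizer metadata (scales, offsets) is ignored in this rough count."""
    total_bits = n * k * bits_left + k * d * bits_right
    return total_bits / (n * d)

# Hypothetical sizes: 2 * 4096 * 512 * 4 bits spread over 4096 * 4096 entries.
rate = bits_per_element(4096, 4096, 512, 4, 4)
print(rate)  # 1.0
```

Under these assumed sizes, two 4-bit factors of rank 512 cost the same as storing one bit for every entry of the original matrix.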

Abstract

Matrices are exceptionally useful in various fields of study as they provide a convenient framework to organize and manipulate data in a structured manner. However, modern matrices can involve billions of elements, making their storage and processing quite demanding in terms of computational resources and memory usage. Although prohibitively large, such matrices are often approximately low rank. We propose an algorithm that exploits this structure to obtain a low rank decomposition of any matrix $A$ as $A \approx LR$, where $L$ and $R$ are the low rank factors. The total number of elements in $L$ and $R$ can be significantly less than that in $A$. Furthermore, the entries of $L$ and $R$ are quantized to low precision formats, compressing $A$ by giving us a low rank and low…
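
The idea in the abstract, a sketched low rank factorization $A \approx LR$ whose factors are stored in low precision, can be illustrated with a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' reference implementation: the Gaussian sketching matrix, the min-max uniform quantizer, the least-squares refit of the right factor, and the names `uniform_quantize` and `low_rank_low_precision` are all choices made here for illustration, and the synthetic rank-8 test matrix is made up.

```python
import numpy as np

def uniform_quantize(X, bits):
    """Round X onto 2**bits evenly spaced levels spanning [X.min(), X.max()].
    (A simple stand-in quantizer, assumed for this example.)"""
    lo, hi = float(X.min()), float(X.max())
    if hi == lo:
        return X.copy()
    step = (hi - lo) / (2 ** bits - 1)
    return lo + step * np.round((X - lo) / step)

def low_rank_low_precision(A, rank, bits_left, bits_right, seed=0):
    """Return quantized factors L (n x rank) and R (rank x d) with A ~= L @ R."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    S = rng.standard_normal((d, rank))           # random Gaussian sketch (assumed)
    L = uniform_quantize(A @ S, bits_left)       # quantized left factor
    # Refit the right factor against the already-quantized L, then quantize it.
    W, *_ = np.linalg.lstsq(L, A, rcond=None)
    R = uniform_quantize(W, bits_right)
    return L, R

# Synthetic matrix of exact rank 8 (made-up data for the demonstration).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 8)) @ rng.standard_normal((8, 100))

L, R = low_rank_low_precision(A, rank=8, bits_left=8, bits_right=8)
rel_err = np.linalg.norm(A - L @ R) / np.linalg.norm(A)
print(L.shape, R.shape, rel_err)
```

Fitting the right factor after the left factor has been quantized lets the least-squares step absorb part of the first quantization error. For simplicity the factors here are returned as floating-point arrays that take at most `2**bits` distinct values; a real implementation would store the integer codes plus the quantizer range.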

Code & Models

Repositories

pilancilab/matrix-compressor (PyTorch, official)

Videos

Matrix Compression via Randomized Low Rank and Low Precision Factorization · SlidesLive

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Advanced Image and Video Retrieval Techniques · Machine Learning and Algorithms