Efficient $1$-bit tensor approximations

Alex W. Neal Riasanovsky; Sarah El Kazdadi

arXiv:2410.01799 · math.CO · October 3, 2024

TL;DR

This paper introduces an efficient method for approximating matrices and tensors as linear combinations of tensor products of $\{-1, 1\}$-valued (1-bit) vectors, which gives substantial spatial compression at small relative error, and demonstrates its application to compressing large language model weights and images.

Contribution

The paper presents a novel 1-bit tensor decomposition technique that is simple, fast, and memory-efficient, extending previous work to tensors and practical large-scale applications.

Findings

01

Achieved 50% spatial compression of Mistral-7B model weights with less than 6% error.

02

The decomposition algorithm is simple, requiring only 20 lines of pseudocode.

03

Open source implementation optimized with SIMD instructions.

Abstract

We present a spatially efficient decomposition of matrices and arbitrary-order tensors as linear combinations of tensor products of $\{-1, 1\}$-valued vectors. For any matrix $A \in \mathbb{R}^{m \times n}$, $A - R_{w} = S_{w} C_{w} T_{w}^{\top} = \sum_{j=1}^{w} c_{j} \cdot s_{j} t_{j}^{\top}$ is a $w$-width signed cut decomposition of $A$. Here $C_{w} = \operatorname{diag}(c_{w})$ for some $c_{w} \in \mathbb{R}^{w}$, and $S_{w}, T_{w}$, and the vectors $s_{j}, t_{j}$ are $\{-1, 1\}$-valued. To store $(S_{w}, T_{w}, C_{w})$, we may pack $w \cdot (m + n)$ bits, and require only $w$ floating point numbers. As a function of $w$, $\| R_{w} \|_{F}$ exhibits exponential decay when applied to f32 matrices with i.i.d. $\mathcal{N}(0, 1)$ entries. Choosing $w$ so that $(S_{w}, T_{w}, C_{w})$ has the same memory footprint as a f16 or bf16 matrix, the relative error is comparable.…
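
To make the shape of a $w$-width signed cut decomposition concrete, here is a minimal NumPy sketch that builds one greedily. It is not the authors' algorithm, which the abstract does not spell out. The alternating sign updates, the function names `signed_cut_decomposition` and `width_for_bit_budget`, the parameters `sweeps` and `seed`, and the 32-bit coefficient size are all assumptions made for illustration.

```python
import numpy as np


def signed_cut_decomposition(A, w, sweeps=10, seed=0):
    """Greedy sketch (assumed heuristic, not the paper's method).

    Returns (S, c, T) with A ~= S @ diag(c) @ T.T, where S and T are {-1, +1}-valued.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    R = np.array(A, dtype=np.float64)      # residual, starts as a copy of A
    S = np.empty((m, w), dtype=np.int8)
    T = np.empty((n, w), dtype=np.int8)
    c = np.empty(w)
    for j in range(w):
        # Heuristic search for sign vectors s, t that make s^T R t large.
        s = rng.choice([-1.0, 1.0], size=m)
        for _ in range(sweeps):
            t = np.where(R.T @ s >= 0, 1.0, -1.0)
            s = np.where(R @ t >= 0, 1.0, -1.0)
        # Least-squares coefficient for the term s t^T, whose squared norm is m * n.
        c[j] = (s @ R @ t) / (m * n)
        R -= c[j] * np.outer(s, t)
        S[:, j], T[:, j] = s, t
    return S, c, T


def width_for_bit_budget(m, n, bits_per_entry=16, coeff_bits=32):
    """Largest w with w*(m+n) sign bits plus w coefficients within m*n*bits_per_entry bits.

    coeff_bits=32 is an assumption; the abstract only says "w floating point numbers".
    """
    return (m * n * bits_per_entry) // (m + n + coeff_bits)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((64, 48))
    for w in (8, 32, 128):
        S, c, T = signed_cut_decomposition(A, w)
        R = A - (S * c) @ T.T              # residual R_w
        print(w, np.linalg.norm(R) / np.linalg.norm(A))
    print(width_for_bit_budget(64, 48))    # width matching a 16-bit footprint
```

For the least-squares coefficient used here, each step satisfies $\| R - c\, s t^{\top} \|_{F}^{2} = \| R \|_{F}^{2} - (s^{\top} R t)^{2} / (mn)$, so the residual norm cannot increase as $w$ grows. How quickly it decays depends on how well the sign vectors are chosen, and this sketch makes no claim to match the decay rate reported in the abstract.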

Taxonomy

Topics: Tensor decomposition and applications · Computational Physics and Python Applications

Methods: Exponential Decay