Embedding Compression via Spherical Coordinates

Han Xiao

arXiv:2602.00079 · cs.LG · March 27, 2026

Open Access

TL;DR

This paper introduces an embedding compression technique based on spherical coordinates. It achieves 1.5x compression of unit-norm embeddings with reconstruction error bounded by float32 machine epsilon, 25% better than the best prior lossless method, across text, image, and multi-vector embeddings.

Contribution

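The method exploits the fact that the spherical coordinates of high-dimensional unit vectors concentrate around $\pi/2$. This makes the IEEE 754 exponents and high-order mantissa bits predictable enough for entropy coding, while reconstruction error stays bounded by float32 machine epsilon.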

Findings

01

Achieves 1.5x compression, 25% better than the best prior lossless method.

02

Maintains zero measurable retrieval degradation on BEIR benchmarks.

03

Consistent compression improvement across 26 configurations spanning text, image, and multi-vector embeddings.

Abstract

We present an $\epsilon$-bounded compression method for unit-norm embeddings that achieves 1.5$\times$ compression, 25% better than the best prior lossless method. The method exploits that spherical coordinates of high-dimensional unit vectors concentrate around $\pi/2$, causing IEEE 754 exponents to collapse to a single value and high-order mantissa bits to become predictable, enabling entropy coding of both. Reconstruction error is bounded by float32 machine epsilon ($1.19 \times 10^{-7}$), making reconstructed values indistinguishable from originals at float32 precision. Evaluation across 26 configurations spanning text, image, and multi-vector embeddings confirms consistent compression improvement with zero measurable retrieval degradation on BEIR benchmarks.
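The concentration effect described in the abstract can be illustrated with a short sketch. This is not the paper's codec: it only converts a vector to hyperspherical angles using the standard textbook parameterization and compares how the float32 exponent field is distributed before and after. The function names (`to_spherical_angles`, `from_spherical_angles`, `float32_exponents`, `top_exponent_share`), the dimension of 1024, and the use of a random Gaussian vector as a stand-in for a real embedding are all assumptions made for illustration.

```python
import numpy as np


def to_spherical_angles(x):
    """Map a unit vector in R^d to d-1 hyperspherical angles."""
    x = np.asarray(x, dtype=np.float64)
    # tail_norm[k] = sqrt(x[k]^2 + ... + x[d-1]^2)
    tail_norm = np.sqrt(np.cumsum(x[::-1] ** 2)[::-1])
    # Angle k is measured between x[k] and the norm of the remaining tail.
    angles = np.arctan2(tail_norm[1:], x[:-1])
    # The last angle keeps the sign of the final coordinate.
    angles[-1] = np.arctan2(x[-1], x[-2])
    return angles


def from_spherical_angles(angles):
    """Inverse map: rebuild the unit vector from its angles."""
    sin_prod = np.concatenate(([1.0], np.cumprod(np.sin(angles))))
    cos_ext = np.concatenate((np.cos(angles), [1.0]))
    return sin_prod * cos_ext


def float32_exponents(values):
    """Return the 8-bit biased IEEE 754 exponent of each value as float32."""
    bits = np.asarray(values, dtype=np.float32).view(np.uint32)
    return ((bits >> 23) & 0xFF).astype(np.int64)


def top_exponent_share(values):
    """Fraction of values that share the single most common exponent."""
    counts = np.bincount(float32_exponents(values), minlength=256)
    return counts.max() / counts.sum()


# Assumption: a random Gaussian direction stands in for a real embedding.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
x /= np.linalg.norm(x)

angles = to_spherical_angles(x)
x_back = from_spherical_angles(angles)

cartesian_share = top_exponent_share(x)
angle_share = top_exponent_share(angles)
max_roundtrip_error = np.max(np.abs(x_back - x))

print("most common exponent share, Cartesian:", cartesian_share)
print("most common exponent share, angles:   ", angle_share)
print("mean angle:", angles[:-1].mean(), "vs pi/2:", np.pi / 2)
print("float64 round-trip error:", max_roundtrip_error)
print("float32 machine epsilon:", np.finfo(np.float32).eps)
```

In this sketch the Cartesian coordinates spread over several exponent values, while almost all angles fall in the interval [1, 2) around $\pi/2$ and therefore share one exponent. That is the property the abstract says makes the exponent field cheap to entropy-code. The round trip here is done in float64 and says nothing about the paper's actual error bound, which depends on details of the method not given in the abstract.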

Taxonomy

Topics: Advanced Data Compression Techniques · Algorithms and Data Compression · Video Coding and Compression Technologies