The Information Theory of Similarity

Nikit Phadke

arXiv:2512.00378 · cs.IT · December 2, 2025

Open Access

TL;DR

This paper unifies similarity search methods with information theory, showing that similarity measures relate to mutual information and that fundamental limits govern encoding efficiency and ranking preservation.

Contribution

It establishes a mathematical equivalence between witness-based similarity systems (REWA) and Shannon's information theory, derives fundamental lower bounds on encoding cost, and argues that semantic similarity can be measured in bits of mutual information.

Findings

01. REWA's complexity bound is proven optimal.

02. Similarity search relates to mutual information and channel capacity.

03. Fundamental limits constrain encoding and ranking preservation.

Abstract

We establish a precise mathematical equivalence between witness-based similarity systems (REWA) and Shannon's information theory. We prove that witness overlap is mutual information, that REWA bit complexity bounds arise from channel capacity limitations, and that ranking-preserving encodings obey rate-distortion constraints. This unification reveals that fifty years of similarity search research -- from Bloom filters to locality-sensitive hashing to neural retrieval -- implicitly developed information theory for relational data. We derive fundamental lower bounds showing that REWA's $O(\Delta^{-2} \log N)$ complexity is optimal: no encoding scheme can preserve similarity rankings with fewer bits. The framework establishes that semantic similarity has physical units (bits of mutual information), search is communication (query transmission over a noisy channel), and retrieval systems…
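
To give a feel for how a bit budget of order $\Delta^{-2} \log N$ scales, the sketch below uses a standard Hoeffding-plus-union-bound argument. It is not the paper's proof or its constants. It assumes each encoded bit is an independent sample in [0, 1] whose mean is the true similarity, that Δ is the gap separating the similarity scores to be ranked, and that all N candidates must be estimated to within Δ/2 with total failure probability at most `failure_prob`. The function name and parameters are hypothetical.

```python
import math


def hoeffding_bit_budget(n_items: int, gap: float, failure_prob: float = 0.01) -> int:
    """Bits per item sufficient to rank n_items correctly under the
    assumptions stated above (illustrative, not the paper's construction).

    Hoeffding: P(|estimate - truth| >= gap/2) <= 2 * exp(-m * gap**2 / 2).
    A union bound over n_items candidates gives
        m >= (2 / gap**2) * ln(2 * n_items / failure_prob).
    """
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    if not 0.0 < gap <= 1.0:
        raise ValueError("gap must lie in (0, 1]")
    if not 0.0 < failure_prob < 1.0:
        raise ValueError("failure_prob must lie in (0, 1)")
    return math.ceil((2.0 / gap**2) * math.log(2.0 * n_items / failure_prob))


if __name__ == "__main__":
    # Halving the gap roughly quadruples the budget; growing N by 1000x
    # adds only a logarithmic term.
    for n, gap in [(10**3, 0.1), (10**3, 0.05), (10**6, 0.1)]:
        print(f"N={n:>8}, gap={gap}: {hoeffding_bit_budget(n, gap)} bits")
```

The inverse-square dependence on the gap dominates in this sketch: tightening Δ is far more expensive than enlarging the collection, which matches the shape of the bound quoted in the abstract.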

Taxonomy

Topics: Advanced Graph Neural Networks · Ferroelectric and Negative Capacitance Devices · Stochastic Gradient Optimization Techniques