REWA: A General Theory of Witness-Based Similarity
Nikit Phadke

TL;DR
This paper introduces a universal framework for similarity-preserving encodings that unifies methods such as LSH, Bloom filters, and attention kernels under a single theoretical model, with complexity bounds and explicit constructions.
Contribution
It formulates similarity as witness projection over monoids, unifying diverse methods and proving complexity bounds with explicit constants and constructions.
Findings
Unified framework for similarity methods including LSH, Bloom filters, and attention kernels.
Proved \(O(\log N)\) complexity bounds for all major similarity methods.
Provided explicit constructions and tight concentration bounds for various algebraic structures.
Abstract
We present a universal framework for similarity-preserving encodings that subsumes all discrete, continuous, algebraic, and learned similarity methods under a single theoretical umbrella. By formulating similarity as functional witness projection over monoids, we prove that \( O\!\left(\frac{1}{\Delta^{2}}\log N\right) \) encoding complexity with ranking preservation holds for arbitrary algebraic structures. This unification reveals that Bloom filters, Locality Sensitive Hashing (LSH), Count-Min sketches, Random Fourier Features, and Transformer attention kernels are instances of the same underlying mechanism. We provide complete proofs with explicit constants under 4-wise independent hashing, handle heavy-tailed witnesses via normalization and clipping, and prove \( O(\log N) \) complexity for all major similarity methods from 1970-2024. We give explicit constructions for Boolean,…
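To make the \( O\!\left(\frac{1}{\Delta^{2}}\log N\right) \) scaling concrete, here is a minimal sketch that uses MinHash, a familiar LSH instance, as a stand-in. It is not the paper's construction. It assumes witnesses are plain set elements, that hash functions behave as fully independent (the paper analyzes 4-wise independent hashing, so its constants will differ), and that the bound comes from a standard Hoeffding plus union-bound argument. The names `sketch_size`, `encode`, and `estimate_similarity` are hypothetical.

```python
import hashlib
import math


def sketch_size(n_items, gap, fail_prob=0.05):
    """Hashes needed so every estimate is within gap/2 of the truth.

    Hoeffding plus a union bound over n_items comparisons gives
    m >= (2 / gap^2) * ln(2 * n_items / fail_prob).
    Assumes fully independent hashes (an idealization, not the paper's setting).
    """
    return math.ceil((2.0 / gap ** 2) * math.log(2.0 * n_items / fail_prob))


def _hash64(seed, witness):
    # Keyed 64-bit hash; blake2b stands in for a random hash function.
    data = f"{seed}:{witness}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def encode(witnesses, m):
    # One min-hash per seed; witnesses must be a non-empty collection.
    return [min(_hash64(seed, w) for w in witnesses) for seed in range(m)]


def estimate_similarity(sig_a, sig_b):
    # The fraction of agreeing positions estimates the Jaccard similarity.
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


m = sketch_size(n_items=100, gap=0.2)
sig_a = encode(set(range(0, 100)), m)
sig_b = encode(set(range(50, 150)), m)
print(m, estimate_similarity(sig_a, sig_b))  # true Jaccard similarity is 1/3
```

Under these assumptions, halving the gap \(\Delta\) quadruples the signature length, while multiplying the number of items \(N\) by ten only adds a constant number of hashes.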
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Graph Neural Networks · Multimodal Machine Learning Applications
