Sum Estimation via Vector Similarity Search
Stephen Mussmann, Mehul Smriti Raje, Kavya Tumkur, Oumayma Messoussi, Cyprien Hachem, Seby Jacob

TL;DR
This paper introduces a sum estimation algorithm built on vector similarity search that reduces the number of most similar vectors that must be retrieved from O(√n) to O(log n), where n is the number of objects, improving efficiency in machine learning applications.
Contribution
The authors develop a new algorithm for sum estimation that requires only O(log n) most similar vectors, outperforming existing methods in accuracy and computational efficiency.
Findings
Achieves lower error with less computation than existing methods
Effective in estimating densities and softmax denominators
Demonstrates practical improvements on OpenImages and Amazon Reviews datasets
Abstract
Semantic embeddings to represent objects such as image, text and audio are widely used in machine learning and have spurred the development of vector similarity search methods for retrieving semantically related objects. In this work, we study the sibling task of estimating a sum over all objects in a set, such as the kernel density estimate (KDE) and the normalizing constant for softmax distributions. While existing solutions provably reduce the sum estimation task to acquiring O(√n) most similar vectors, where n is the number of objects, we introduce a novel algorithm that only requires O(log n) most similar vectors. Our approach randomly assigns objects to levels with exponentially-decaying probabilities and constructs a vector similarity search data structure for each level. With the top-k objects from each level, we propose an unbiased estimate…
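As a rough illustration of the setup the abstract describes, the sketch below computes the two target sums exactly by brute force and then builds the level structure. Several details are assumptions that do not come from the text above: the decay rate of 1/2 per level, the choice to place each object in exactly one level, the cap at about log2(n) levels, the Gaussian kernel with a `bandwidth` parameter, and every function name. A sorted scan stands in for a real similarity search index. The sketch stops at retrieving the top-k objects per level; how they are combined into the unbiased estimate is not specified in the truncated abstract and is not attempted here.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax_denominator(query, vectors):
    """Exact softmax normalizing constant: sum over all objects of exp(<query, x>)."""
    return sum(math.exp(dot(query, x)) for x in vectors)


def gaussian_kde(query, vectors, bandwidth=1.0):
    """Exact KDE with an unnormalized Gaussian kernel (kernel choice is an assumption)."""
    total = 0.0
    for x in vectors:
        sq_dist = sum((a - b) ** 2 for a, b in zip(query, x))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / len(vectors)


def assign_levels(n, rng, max_level):
    """Place each object index in one level.

    Assumption: level l is reached by l consecutive fair coin flips, so its
    probability decays as 2^-(l+1); leftover mass goes to max_level.
    """
    levels = [[] for _ in range(max_level + 1)]
    for i in range(n):
        level = 0
        while level < max_level and rng.random() < 0.5:
            level += 1
        levels[level].append(i)
    return levels


def top_k_per_level(query, vectors, levels, k):
    """Brute-force stand-in for one similarity search index per level."""
    retrieved = []
    for members in levels:
        ranked = sorted(members, key=lambda i: dot(query, vectors[i]), reverse=True)
        retrieved.append(ranked[:k])
    return retrieved


# Usage on synthetic data (all sizes here are arbitrary illustrative choices).
rng = random.Random(0)
n, d, k = 1000, 8, 5
vectors = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
query = [rng.gauss(0, 1) for _ in range(d)]

max_level = int(math.log2(n))
levels = assign_levels(n, rng, max_level)
retrieved = top_k_per_level(query, vectors, levels, k)

exact_sum = softmax_denominator(query, vectors)  # touches all n objects
num_retrieved = sum(len(r) for r in retrieved)   # at most k * (max_level + 1)
```

With these assumed settings the exact sum touches all n objects, while the per-level retrieval returns at most k * (max_level + 1) of them, which grows logarithmically in n for fixed k.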
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Topic Modeling
