Diamond Sampling for Approximate Maximum All-pairs Dot-product (MAD) Search
Grey Ballard, Ali Pinar, Tamara G. Kolda, C. Seshadhri

TL;DR
This paper introduces diamond sampling, a method that approximates the top dot products between two sets of vectors by sampling four-cycles (diamonds). It runs orders of magnitude faster than direct computation and needs fewer samples than competing approaches.
Contribution
The paper presents a diamond sampling technique for approximate maximum all-pairs dot-product search. Each pair is sampled with probability proportional to its squared dot product, which concentrates the samples on the largest-magnitude pairs.
Findings
Diamond sampling is orders of magnitude faster than direct computation.
Requires fewer samples than competing methods for similar accuracy.
Achieves better results than state-of-the-art hashing in maximum inner product search.
Abstract
Given two sets of vectors, $A = \{a_1, \dots, a_m\}$ and $B = \{b_1, \dots, b_n\}$, our problem is to find the top-$t$ dot products, i.e., the largest $|a_i \cdot b_j|$ among all possible pairs. This is a fundamental mathematical problem that appears in numerous data applications involving similarity search, link prediction, and collaborative filtering. We propose a sampling-based approach that avoids direct computation of all $mn$ dot products. We select diamonds (i.e., four-cycles) from the weighted tripartite representation of $A$ and $B$. The probability of selecting a diamond corresponding to pair $(i, j)$ is proportional to $(a_i \cdot b_j)^2$, amplifying the focus on the largest-magnitude entries. Experimental results indicate that diamond sampling is orders of magnitude faster than direct computation and requires far fewer samples than any competing approach. We also…
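The abstract does not spell out the sampling steps, so the following is only a minimal Python sketch of how a sampler with the stated property (pair selected with probability proportional to the squared dot product) could look. It assumes the vectors are stored as rows of NumPy arrays `A` (m × d) and `B` (n × d). The weight matrix `W`, the three-step draw, and the helper names `diamond_sample`, `top_t_pairs`, and `t_prime` are illustrative assumptions, not the authors' code.

```python
import numpy as np

def diamond_sample(A, B, s, rng=None):
    """Estimate the squared dot products (a_i . b_j)^2 from s sampled diamonds.

    Assumed layout: rows of A (m x d) and rows of B (n x d) are the vectors.
    """
    rng = np.random.default_rng(rng)
    m, d = A.shape
    n = B.shape[0]
    absA, absB = np.abs(A), np.abs(B)
    row_norm_A = absA.sum(axis=1)   # 1-norm of each vector a_i
    col_norm_B = absB.sum(axis=0)   # 1-norm of coordinate k across all b_j
    # Assumed edge weights: w_ik = |a_ik| * ||a_i||_1 * sum_j |b_jk|
    W = absA * row_norm_A[:, None] * col_norm_B[None, :]
    total = W.sum()
    X = np.zeros((m, n))
    # Step 1: draw s (i, k) edges with probability proportional to w_ik.
    flat = rng.choice(m * d, size=s, p=(W / total).ravel())
    I, K = np.unravel_index(flat, (m, d))
    for i, k in zip(I, K):
        # Step 2: draw j with probability proportional to |b_jk|.
        j = rng.choice(n, p=absB[:, k] / col_norm_B[k])
        # Step 3: draw k2 with probability proportional to |a_ik2|.
        k2 = rng.choice(d, p=absA[i] / row_norm_A[i])
        # Close the four-cycle (i, k, j, k2) with a signed contribution.
        X[i, j] += np.sign(A[i, k] * B[j, k] * A[i, k2]) * B[j, k2]
    # In expectation, X * total / s equals the matrix of squared dot products.
    return X * total / s

def top_t_pairs(A, B, t, s, t_prime=None, rng=None):
    """Return up to t (i, j, a_i . b_j) triples, largest magnitude first."""
    X = diamond_sample(A, B, s, rng)
    # Hypothetical candidate budget: keep more candidates than needed.
    t_prime = min(t_prime or 5 * t, X.size)
    cand = np.argpartition(X.ravel(), -t_prime)[-t_prime:]
    I, J = np.unravel_index(cand, X.shape)
    # Compute exact dot products only for the candidate pairs.
    exact = np.einsum("cd,cd->c", A[I], B[J])
    order = np.argsort(-np.abs(exact))[:t]
    return [(int(I[o]), int(J[o]), float(exact[o])) for o in order]
```

With this layout, a call such as `top_t_pairs(A, B, t=10, s=100_000)` would compute exact dot products for only 50 candidate pairs rather than all $mn$. The number of samples `s` and the candidate budget `t_prime` trade accuracy for speed and would need tuning per dataset.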
