Revisiting Wedge Sampling for Budgeted Maximum Inner Product Search

Stephan S. Lorenzen; Ninh Pham

arXiv:1908.08656·cs.DB·September 15, 2020

TL;DR

This paper improves budgeted top-k maximum inner product search by demonstrating wedge sampling's efficiency and accuracy advantages and introducing a wedge-based algorithm that outperforms existing methods.

Contribution

It shows that wedge sampling is competitive with, and often superior to, diamond sampling for budgeted top-k MIPS, and it introduces a fast wedge-based algorithm with high precision.

Findings

1. Wedge sampling often outperforms diamond sampling in efficiency and accuracy.

2. The proposed wedge-based algorithm is significantly faster than state-of-the-art methods.

3. The algorithm maintains at least 80% top-5 precision on standard datasets.

Abstract

Top-k maximum inner product search (MIPS) is a central task in many machine learning applications. This paper extends top-k MIPS to a budgeted setting that asks for the best approximate top-k MIPS given a limit of B computational operations. We investigate recent advanced sampling algorithms, including wedge and diamond sampling, to solve it. Although the design of these sampling schemes naturally supports budgeted top-k MIPS, they suffer from the linear cost of scanning all data points to retrieve the top-k results and from performance degradation when handling negative inputs. This paper makes two main contributions. First, we show that diamond sampling is essentially a combination of wedge sampling and basic sampling for top-k MIPS. Our theoretical analysis and empirical evaluation show that wedge is competitive with (and often superior to) diamond on approximating top-k MIPS regarding…
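
To make the wedge sampling idea in the abstract concrete, here is a minimal Python sketch of basic wedge sampling followed by exact reranking. It is an illustration under stated assumptions, not the authors' algorithm: it assumes non-negative data and query vectors, and the function name `wedge_topk`, the split of the budget into `num_samples` and `num_candidates`, and the toy data are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def wedge_topk(data, query, k, num_samples, num_candidates, seed=0):
    """Approximate top-k MIPS with basic wedge sampling plus exact reranking.

    Assumption: every entry of `data` and `query` is non-negative.
    `num_samples` and `num_candidates` are a hypothetical way of splitting
    a computational budget between sampling and reranking.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])

    # Column sums c_j = sum_i x_ij; these depend only on the data set,
    # so in practice they could be precomputed once.
    col_sums = [sum(row[j] for row in data) for j in range(d)]

    # Dimension j is drawn with probability proportional to q_j * c_j.
    dim_weights = [query[j] * col_sums[j] for j in range(d)]
    if sum(dim_weights) <= 0:
        return []  # nothing to sample from (all inner products are zero)

    # Given dimension j, point i is drawn with probability x_ij / c_j.
    # The expected counter of point i is then proportional to <x_i, q>.
    counts = Counter()
    sampled_dims = rng.choices(range(d), weights=dim_weights, k=num_samples)
    for j, hits in Counter(sampled_dims).items():
        column = [row[j] for row in data]
        for i in rng.choices(range(n), weights=column, k=hits):
            counts[i] += 1

    # Rerank only the most frequently sampled points with exact inner products.
    candidates = [i for i, _ in counts.most_common(num_candidates)]

    def inner(i):
        return sum(a * b for a, b in zip(data[i], query))

    return sorted(candidates, key=inner, reverse=True)[:k]


if __name__ == "__main__":
    toy_data = [[10.0, 10.0], [0.1, 0.1], [0.1, 0.2]]
    print(wedge_topk(toy_data, [1.0, 1.0], k=1, num_samples=1000, num_candidates=2))
```

In this sketch the sampling step costs work proportional to the number of samples, while the reranking step costs one exact inner product per candidate, so the two parameters together act as a rough stand-in for the budget B. Note that building each `column` list here still touches every data point, which mirrors the linear scanning cost the abstract identifies as a weakness of the basic scheme.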
