On Efficient Approximate Aggregate Nearest Neighbor Queries over Learned Representations
Carrie Wang, Sihem Amer-Yahia, Laks V. S. Lakshmanan, Reynold Cheng

TL;DR
This paper introduces SPRinT, a framework for answering approximate aggregate nearest neighbor queries over learned representations efficiently and accurately. It combines high-quality and low-cost representations to cope with the expense of generating high-quality representations and with aggregation functions that differ in their sensitivity to neighbor selection errors.
Contribution
The paper proposes SPRinT, a novel query answering framework that combines sampling, neighbor selection, and aggregation, with theoretical bounds and extensive empirical validation.
Findings
SPRinT achieves lower aggregation error than existing methods.
SPRinT maintains stable performance as dataset size increases.
SPRinT scales to large datasets while remaining accurate and efficient.
Abstract
We study Aggregation Queries over Nearest Neighbors (AQNN), which compute aggregates over the learned representations of the neighborhood of a designated query object. For example, a medical professional may be interested in the average heart rate of patients whose representations are similar to that of an insomnia patient. Answering AQNNs accurately and efficiently is challenging due to the high cost of generating high-quality representations (e.g., via a deep learning model trained on human expert annotations) and the different sensitivities of different aggregation functions to neighbor selection errors. We address these challenges by combining high-quality and low-cost representations to approximate the aggregate. We characterize value- and count-sensitive AQNNs and propose the Sampler with Precision-Recall in Target (SPRinT), a query answering framework that works in three steps:…
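To make the query type concrete, here is a minimal sketch of an exact aggregate-over-neighbors query in Python. It is not SPRinT and does not reproduce the paper's sampling or bounds. It assumes a k-nearest neighborhood under Euclidean distance and an average aggregate. The function name `aqnn_average` and the toy data are hypothetical and chosen only for illustration.

```python
import math

def aqnn_average(query, embeddings, values, k):
    """Average of `values` over the k embeddings closest to `query`.

    Illustrative assumption: the neighborhood is the k nearest objects
    under Euclidean distance; the paper's own definition may differ.
    """
    if not 0 < k <= len(embeddings):
        raise ValueError("k must be between 1 and the number of objects")
    # Rank object indices by distance to the query embedding.
    ranked = sorted(range(len(embeddings)),
                    key=lambda i: math.dist(query, embeddings[i]))
    neighbors = ranked[:k]
    # Aggregate the attribute of interest over the selected neighbors.
    return sum(values[i] for i in neighbors) / k

# Toy data (made up): 2-D "patient" embeddings and their heart rates.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
heart_rates = [60.0, 70.0, 80.0, 120.0]
query = (0.0, 0.0)

result = aqnn_average(query, embeddings, heart_rates, k=3)
print(result)
```

In this sketch every embedding is assumed to be available up front. The setting described in the abstract differs in that high-quality representations are costly to generate, which is why an approximate answer that also uses low-cost representations is of interest.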
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Healthcare · Topic Modeling
