Low-Precision Quantization for Efficient Nearest Neighbor Search
Anthony Ko, Iman Keivanloo, Vihan Lakshman, Eric Schkufza

TL;DR
This paper introduces a low-precision quantization method that represents vectors as integers, reducing memory usage and speeding up nearest neighbor search with minimal loss of accuracy. The method can be combined with existing KNN frameworks.
Contribution
It proposes a quantization approach that aims to preserve the behavior of the distance metric rather than to minimize reconstruction error, which lowers the memory and compute cost of KNN search.
Findings
Reduces memory by 60% using quantized vectors.
Increases query throughput by 30%.
Maintains 98% of original recall performance.
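To make the recall figure concrete, here is a minimal sketch of how recall@k and "fraction of original recall retained" are commonly computed. The function names `recall_at_k` and `relative_recall` are hypothetical, and the paper's exact evaluation protocol is not given on this page, so treat this as one plausible reading rather than the authors' procedure.

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the true top-k neighbors that the evaluated search also returned."""
    exact = set(exact_ids[:k])
    approx = set(approx_ids[:k])
    return len(exact & approx) / k


def relative_recall(baseline_recall, quantized_recall):
    """Recall of the quantized index as a fraction of the unquantized index's recall."""
    return quantized_recall / baseline_recall


# Hypothetical ids: three of the four true neighbors were recovered.
print(recall_at_k([1, 2, 3, 4], [1, 2, 4, 9], k=4))
```

Under this reading, "maintains 98% of original recall" would mean `relative_recall` is about 0.98 when comparing the same KNN index with and without quantization.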
Abstract
Fast k-Nearest Neighbor search over real-valued vector spaces (KNN) is an important algorithmic task for information retrieval and recommendation systems. We present a method for using reduced precision to represent vectors through quantized integer values, enabling both a reduction in the memory overhead of indexing these vectors and faster distance computations at query time. While most traditional quantization techniques focus on minimizing the reconstruction error between a point and its uncompressed counterpart, we focus instead on preserving the behavior of the underlying distance metric. Furthermore, our quantization approach is applied at the implementation level and can be combined with existing KNN algorithms. Our experiments on both open source and proprietary datasets across multiple popular KNN frameworks validate that quantized distance metrics can reduce memory by 60% and…
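The general idea of computing distances on integer codes can be sketched as follows. This is not the authors' scheme, which the truncated abstract does not specify; it assumes plain uniform scalar quantization with one global scale, 8-bit signed codes, and squared Euclidean distance. All function names (`fit_scale`, `quantize`, `squared_l2_int`, `knn`) are hypothetical.

```python
import random


def fit_scale(vectors, bits=8):
    """Pick one global scale so the largest absolute component maps to the integer limit."""
    max_abs = max(abs(x) for v in vectors for x in v)
    limit = 2 ** (bits - 1) - 1  # 127 for 8 bits (assumed bit width)
    return limit / max_abs if max_abs > 0 else 1.0


def quantize(vector, scale, bits=8):
    """Scale each component, round to the nearest integer, and clamp to the signed range."""
    limit = 2 ** (bits - 1) - 1
    return [max(-limit, min(limit, round(x * scale))) for x in vector]


def squared_l2_int(a, b):
    """Squared Euclidean distance computed on integer codes only."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(query, database, k, dist):
    """Brute-force k nearest neighbors: indices of the k smallest distances."""
    order = sorted(range(len(database)), key=lambda i: dist(query, database[i]))
    return order[:k]


# Small demo on random data: compare neighbor sets before and after quantization.
random.seed(0)
data = [[random.uniform(-1, 1) for _ in range(16)] for _ in range(200)]
query = [random.uniform(-1, 1) for _ in range(16)]

scale = fit_scale(data + [query])
q_data = [quantize(v, scale) for v in data]
q_query = quantize(query, scale)

exact = knn(query, data, 10, lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)))
approx = knn(q_query, q_data, 10, squared_l2_int)
print("overlap of top-10 neighbor sets:", len(set(exact) & set(approx)))
```

Because every vector shares one scale, integer distances approximate the true distances times a constant factor, so the neighbor ranking is preserved up to rounding error. Ranking is what a KNN search needs, which is one way to read the abstract's emphasis on the distance metric over reconstruction error.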
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Machine Learning and Data Classification
