TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s
Felix Chern, Blake Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, Sanjiv Kumar

TL;DR
This paper introduces a new nearest neighbor search algorithm optimized for TPUs that achieves peak FLOP/s performance, surpassing GPU methods while maintaining high recall and supporting frequent updates.
Contribution
The paper presents a novel TPU-optimized KNN algorithm with an accurate performance model, analytical recall guarantees, and no need for complex indexing or tuning.
Findings
Achieves TPU peak FLOP/s performance in KNN search
Outperforms state-of-the-art GPU algorithms in speed at a similar level of recall
Supports frequent updates without complex index maintenance
Abstract
This paper presents a novel nearest neighbor search algorithm achieving TPU (Google Tensor Processing Unit) peak performance, outperforming state-of-the-art GPU algorithms with a similar level of recall. The design of the proposed algorithm is motivated by an accurate accelerator performance model that takes into account both the memory and instruction bottlenecks. Our algorithm comes with an analytical guarantee of recall in expectation and does not require maintaining a sophisticated index data structure or tuning, making it suitable for applications with frequent updates. Our work is available in the open-source packages of JAX and TensorFlow on TPU.
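To make the ideas of index-free search and recall concrete, here is a minimal pure-Python sketch. It is not the authors' TPU implementation, and nothing in it comes from the abstract: the function names, the binning scheme (keep one winner per bin, then rank the winners), and the demo parameters are all assumptions chosen for illustration. It scores every database vector by inner product, computes an exact top-k and a cheaper binned approximation, and measures the recall of the approximation.

```python
import heapq
import random


def inner_products(query, database):
    """Score every database vector against the query (brute force, no index)."""
    return [sum(q * x for q, x in zip(query, row)) for row in database]


def exact_top_k(scores, k):
    """Indices of the k largest scores, best first."""
    return heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)


def binned_top_k(scores, k, num_bins):
    """Approximate top-k (illustrative assumption, not the paper's kernel).

    Bin b holds indices b, b + num_bins, b + 2 * num_bins, ...
    Only the best entry of each bin survives, so a true top-k item is
    lost only when it shares a bin with a better-scoring item.
    """
    winners = []
    for b in range(num_bins):
        members = range(b, len(scores), num_bins)
        if len(members) > 0:
            winners.append(max(members, key=scores.__getitem__))
    return heapq.nlargest(k, winners, key=scores.__getitem__)


def recall(approx, exact):
    """Fraction of the exact neighbors that the approximate result recovered."""
    return len(set(approx) & set(exact)) / len(exact)


def demo(n=2000, dim=16, k=10, num_bins=128, seed=0):
    """Build a random database and query, then report recall of the binned search."""
    rng = random.Random(seed)
    database = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    query = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    scores = inner_products(query, database)
    return recall(binned_top_k(scores, k, num_bins), exact_top_k(scores, k))
```

Calling `demo()` returns a recall value between 0 and 1. Raising `num_bins` keeps more candidates and can only help recall; when `num_bins` equals the number of scores, the binned result matches the exact one. Because the database is scanned in full on every query, adding or removing vectors needs no index rebuild. In JAX, the approximate top-k operators `jax.lax.approx_max_k` and `jax.lax.approx_min_k` expose a `recall_target` parameter for this kind of trade-off.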
Taxonomy
Topics: Algorithms and Data Compression · Tensor decomposition and applications · Advanced Neural Network Applications
