Solving $k$-Nearest Neighbor Problem on Multiple Graphics Processors
Kimikazu Kato, Tikara Hosino

TL;DR
This paper presents a GPU-based algorithm for efficiently solving the $k$-nearest neighbor problem, a core step in recommendation systems, and demonstrates significant speedup and scalability on multiple GPUs.
Contribution
The paper introduces a novel GPU algorithm combining an $N$-body problem approach and a partial sort optimized for small $k$, achieving high bandwidth utilization and scalability.
Findings
Over 330x faster on two GPUs than a CPU implementation.
Effective scaling with increasing number of GPUs.
High bandwidth utilization through coalesced memory access.
Abstract
A recommendation system is a software system that predicts customers' unknown preferences from known preferences. In a recommendation system, customers' preferences are encoded into vectors, and finding the nearest vectors to each vector is an essential part. This vector-searching part of the problem is called the $k$-nearest neighbor problem. We give an effective algorithm to solve this problem on multiple graphics processing units (GPUs). Our algorithm consists of two parts: an $N$-body problem and a partial sort. For the $N$-body part, we applied the idea of a known algorithm for the $N$-body problem in physics, although another trick is needed to overcome the problem of small-sized shared memory. For the partial sort, we give a novel GPU algorithm which is effective for small $k$. In our partial sort algorithm, a heap is accessed in parallel by threads with a low…
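The two-part structure described in the abstract (all-pairs distance evaluation in the style of an $N$-body computation, followed by a partial sort that keeps only the $k$ smallest distances) can be illustrated with a small sequential sketch. This is not the authors' GPU implementation: it runs on the CPU and uses one private heap per query instead of a heap shared by parallel threads. The squared Euclidean distance is an assumption made here for illustration. The function names `squared_distance` and `knn_brute_force` are hypothetical.

```python
import heapq
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (assumed metric, chosen for illustration)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_brute_force(vectors: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """For each vector, return the indices of its k nearest other vectors.

    Part 1 (N-body style): every pair (i, j) is evaluated.
    Part 2 (partial sort): a bounded max-heap of size k keeps only the
    k smallest distances seen so far, so the full row is never sorted.
    """
    result = []
    for i, v in enumerate(vectors):
        heap = []  # stores (-distance, index); the root is the current worst of the k best
        for j, w in enumerate(vectors):
            if i == j:
                continue  # a vector is not its own neighbor
            d = squared_distance(v, w)
            if len(heap) < k:
                heapq.heappush(heap, (-d, j))
            elif -d > heap[0][0]:
                # closer than the current worst candidate: replace the root
                heapq.heapreplace(heap, (-d, j))
        # order the k survivors from nearest to farthest
        result.append([j for _, j in sorted(heap, reverse=True)])
    return result


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.5)]
    print(knn_brute_force(points, 2))
```

Each distance costs at most one heap update, so a query does O(N log k) work rather than the O(N log N) of a full sort. This is why a heap-based partial sort suits small $k$.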
Taxonomy
Topics: Optimization and Search Problems · Complexity and Algorithms in Graphs · Algorithms and Data Compression
