PFO: A Parallel Friendly High Performance System for Online Query and Update of Nearest Neighbors
Nan Zhu, Wenbo He, Xue Liu, Yu Hua

TL;DR
PFO is a high-performance, parallel-friendly LSH system designed for real-time online nearest neighbor search, offering scalable capacity, reduced latency, and improved neighbor quality for big data applications.
Contribution
It introduces a parallel-friendly LSH indexing structure that handles real-time queries and updates in RAM and scales system capacity by using flash memory.
Findings
Shorter latency compared to existing LSH systems
Higher throughput in online query/update scenarios
Better neighbor quality in streaming applications
Abstract
Nearest Neighbor(s) search is a fundamental computational primitive for tackling massive datasets. Locality Sensitive Hashing (LSH) has become a standard tool for Nearest Neighbor(s) search in high-dimensional spaces. However, traditional LSH systems cannot be applied in online big data systems to handle a large volume of query/update requests, because most of them optimize query efficiency under the assumption of infrequent updates and lack a parallel-friendly design. As a result, state-of-the-art LSH systems cannot adapt the system response to user behavior interactively. In this paper, we propose a new LSH system called PFO. It handles query/update requests in RAM and scales the system capacity by using flash memory. To achieve high streaming data throughput, PFO adopts a parallel-friendly indexing structure while preserving the distance between data points.…
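To make the query/update workload in the abstract concrete, here is a minimal sketch of a generic in-memory LSH index that supports online inserts, removals, and queries. It is not PFO's indexing structure: it uses textbook random-hyperplane (sign) hashing, runs single-threaded, and has no flash tier. The class name `HyperplaneLSH` and the parameters `num_tables` and `bits_per_table` are hypothetical names chosen for illustration.

```python
import math
import random
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


class HyperplaneLSH:
    """Toy LSH index (illustrative only, not the PFO design)."""

    def __init__(self, dim, num_tables=8, bits_per_table=12, seed=0):
        rng = random.Random(seed)
        # Each table has its own random hyperplanes; each hyperplane gives one key bit.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits_per_table)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(set) for _ in range(num_tables)]
        self.points = {}  # point id -> stored vector

    def _key(self, table_idx, vec):
        """Pack the sign of each hyperplane projection into an integer bucket key."""
        key = 0
        for plane in self.planes[table_idx]:
            dot = sum(p * x for p, x in zip(plane, vec))
            key = (key << 1) | (1 if dot >= 0.0 else 0)
        return key

    def insert(self, point_id, vec):
        """Insert a point, replacing any existing point with the same id."""
        if point_id in self.points:
            self.remove(point_id)
        self.points[point_id] = list(vec)
        for t, table in enumerate(self.tables):
            table[self._key(t, vec)].add(point_id)

    def remove(self, point_id):
        """Remove a point from every table; raises KeyError if the id is unknown."""
        vec = self.points.pop(point_id)
        for t, table in enumerate(self.tables):
            key = self._key(t, vec)
            table[key].discard(point_id)
            if not table[key]:
                del table[key]  # drop empty buckets

    def query(self, vec, k=1):
        """Return up to k ids from colliding buckets, ranked by exact cosine similarity."""
        candidates = set()
        for t, table in enumerate(self.tables):
            candidates |= table.get(self._key(t, vec), set())
        ranked = sorted(
            candidates,
            key=lambda pid: cosine_similarity(vec, self.points[pid]),
            reverse=True,
        )
        return ranked[:k]
```

In this sketch every insert or removal touches all tables, which is cheap when updates are rare but becomes a contention point under heavy concurrent traffic. That update cost is the kind of problem the abstract says traditional LSH systems do not address.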
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Caching and Content Delivery · Algorithms and Data Compression
