OOD-DiskANN: Efficient and Scalable Graph ANNS for Out-of-Distribution Queries
Shikhar Jaiswal, Ravishankar Krishnaswamy, Ankit Garg, Harsha Vardhan Simhadri, Sheshansh Agrawal

TL;DR
This paper introduces OOD-DiskANN, a scalable graph-based ANNS algorithm that handles out-of-distribution (OOD) queries efficiently by giving index construction access to a small sample of such queries, reducing mean query latency by up to 40%.
Contribution
The work presents an OOD-aware ANNS method that uses a small sample of OOD queries during index construction to make search more efficient for out-of-distribution queries.
Findings
Up to 40% reduction in mean query latency for OOD queries.
Scalable graph-based index with OOD query adaptation.
Improves efficiency for both OOD and ID queries.
Abstract
State-of-the-art algorithms for Approximate Nearest Neighbor Search (ANNS) such as DiskANN, FAISS-IVF, and HNSW build data dependent indices that offer substantially better accuracy and search efficiency over data-agnostic indices by overfitting to the index data distribution. When the query data is drawn from a different distribution - e.g., when index represents image embeddings and query represents textual embeddings - such algorithms lose much of this performance advantage. On a variety of datasets, for a fixed recall target, latency is worse by an order of magnitude or more for Out-Of-Distribution (OOD) queries as compared to In-Distribution (ID) queries. The question we address in this work is whether ANNS algorithms can be made efficient for OOD queries if the index construction is given access to a small sample set of these queries. We answer positively by presenting…
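The abstract compares latency at a fixed recall target. As a minimal sketch of how that recall is commonly measured, the snippet below computes exact neighbors by brute force and scores an approximate result against them. The function names `brute_force_knn` and `recall_at_k` are illustrative assumptions; they are not taken from the paper or from the DiskANN codebase, and the paper's exact evaluation protocol may differ.

```python
def brute_force_knn(index, query, k):
    """Exact k nearest neighbors of `query` under squared Euclidean distance.

    `index` is a list of equal-length vectors; returns the positions of the
    k closest vectors, nearest first.
    """
    dists = [
        (sum((a - b) ** 2 for a, b in zip(vec, query)), i)
        for i, vec in enumerate(index)
    ]
    dists.sort()
    return [i for _, i in dists[:k]]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Toy usage: three 2-d index points and one query.
index = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
exact = brute_force_knn(index, [0.9, 0.0], k=2)

# A hypothetical approximate search that returned ids 1 and 2.
print(recall_at_k([1, 2], exact))
```

Fixing the recall target and then comparing latency, as the abstract does, keeps the comparison between ID and OOD queries fair: an index that answers quickly but misses most true neighbors would otherwise look better than it is.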
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Data Management and Algorithms · Automated Road and Building Extraction
