Onyx: Cost-Efficient Disk-Oblivious ANN Search
Deevashwer Rathee, Jean-Luc Watson, Zirui Neil Zhao, G. Edward Suh, Raluca Ada Popa

TL;DR
Onyx is a system for disk-oblivious approximate nearest neighbor (ANN) search that co-designs its ANN search and ORAM components to balance SSD bandwidth and access count, reducing both cost and latency.
Contribution
It inverts the prior ORAM-ANN design, which minimizes access count at the ANN level and bandwidth at the ORAM level. Onyx has two components: Onyx-ANNS, which performs bandwidth-efficient pruning, and Onyx-ORAM, which reduces access count.
Findings
Onyx achieves 1.7-9.9x lower cost than prior systems.
Onyx reduces latency by 2.3-12.3x compared to the state of the art.
The system maintains recall while optimizing resource usage.
Abstract
Approximate nearest neighbor (ANN) search in AI systems increasingly handles sensitive data on third-party infrastructure. Trusted execution environments (TEEs) offer protection, but cost-efficient deployments must rely on external SSDs, which leak user queries to the host through disk access patterns. Oblivious RAM (ORAM) can hide these access patterns but at a high cost; when paired with existing disk-based ANN search techniques, it makes poor use of SSD resources, yielding high latency and poor cost-efficiency. The core challenge for efficient oblivious ANN search over SSDs is balancing both bandwidth and access count. The state-of-the-art ORAM-ANN design minimizes access count at the ANN level and bandwidth at the ORAM level, each trading off the other, leaving the combined system with both resources overutilized. We propose inverting this design, minimizing bandwidth consumption…
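To make the bandwidth versus access-count tension concrete, here is a toy back-of-the-envelope model. It is not taken from the paper: the `SsdBudget` type, the `ssd_seconds_per_query` function, the max-of-two-bottlenecks formula, and every number below are illustrative assumptions. It only shows why a design that drives one resource very low while overspending the other can end up slower than a balanced one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SsdBudget:
    """Hypothetical SSD limits; the values used below are made up."""
    iops: float             # random reads the device can serve per second
    bandwidth_bytes: float  # bytes the device can transfer per second


def ssd_seconds_per_query(accesses: int, bytes_read: int, ssd: SsdBudget) -> float:
    """Assumed model: a query is bound by whichever SSD resource it exhausts first."""
    access_time = accesses / ssd.iops
    transfer_time = bytes_read / ssd.bandwidth_bytes
    return max(access_time, transfer_time)


# Hypothetical device: 1M IOPS, 4 GB/s.
ssd = SsdBudget(iops=1_000_000, bandwidth_bytes=4_000_000_000)

# Three hypothetical per-query workloads.
few_large = ssd_seconds_per_query(accesses=100, bytes_read=40_000_000, ssd=ssd)
many_small = ssd_seconds_per_query(accesses=20_000, bytes_read=4_000_000, ssd=ssd)
balanced = ssd_seconds_per_query(accesses=1_000, bytes_read=4_000_000, ssd=ssd)

print(f"few large reads:  {few_large:.4f} s")
print(f"many small reads: {many_small:.4f} s")
print(f"balanced:         {balanced:.4f} s")
```

Under these assumed numbers, the workload with few large reads is bandwidth-bound and the one with many small reads is access-bound; only the balanced workload keeps both terms low.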
