WebANNS: Fast and Efficient Approximate Nearest Neighbor Search in Web Browsers
Mugeng Liu, Siqi Zhong, Qi Yang, Yudong Han, Xuanzhe Liu, and Yun Ma

TL;DR
WebANNS is a novel in-browser approximate nearest neighbor search engine that significantly improves speed and memory efficiency for web applications, enabling practical AI retrieval tasks directly in browsers.
Contribution
WebANNS introduces a WebAssembly-based engine with lazy loading and heuristic memory reduction, addressing computational, storage, and memory challenges unique to web browsers.
Findings
Speeds up queries by up to 743.8x over SOTA
Reduces memory usage by up to 39%
Decreases query time from 10 seconds to 10 milliseconds
Abstract
Approximate nearest neighbor search (ANNS) has become vital to modern AI infrastructure, particularly in retrieval-augmented generation (RAG) applications. Numerous in-browser ANNS engines have emerged to seamlessly integrate with popular LLM-based web applications, while addressing privacy protection and challenges of heterogeneous device deployments. However, web browsers present unique challenges for ANNS, including computational limitations, external storage access issues, and memory utilization constraints, which state-of-the-art (SOTA) solutions fail to address comprehensively. We propose WebANNS, a novel ANNS engine specifically designed for web browsers. WebANNS leverages WebAssembly to overcome computational bottlenecks, designs a lazy loading strategy to optimize data retrieval from external storage, and applies a heuristic approach to reduce memory usage. Experiments show…
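To make the lazy loading idea in the abstract concrete, here is a minimal sketch of fetching vectors from slow external storage only when a query touches them, while keeping a bounded number in memory. It is an illustration under stated assumptions, not the WebANNS implementation: the names `LazyVectorCache`, `squared_l2`, and `nearest` are hypothetical, the eviction policy (LRU) is a stand-in chosen for simplicity, and a plain dictionary plays the role of browser-side external storage.

```python
from collections import OrderedDict
from typing import Callable, Iterable, Sequence

Vector = Sequence[float]


class LazyVectorCache:
    """Bounded in-memory cache; hits the (slow) external store only on a miss.

    Hypothetical helper for illustration; LRU eviction is an assumption,
    not the policy described in the paper.
    """

    def __init__(self, fetch: Callable[[int], Vector], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._fetch = fetch          # stand-in for an external storage read
        self._capacity = capacity    # maximum number of vectors held in memory
        self._cache: "OrderedDict[int, Vector]" = OrderedDict()
        self.misses = 0              # number of external storage reads so far

    def get(self, vec_id: int) -> Vector:
        if vec_id in self._cache:
            # Cache hit: mark as most recently used, no storage access.
            self._cache.move_to_end(vec_id)
            return self._cache[vec_id]
        # Cache miss: load from external storage, then evict if over capacity.
        self.misses += 1
        vec = self._fetch(vec_id)
        self._cache[vec_id] = vec
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # drop the least recently used
        return vec


def squared_l2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query: Vector, candidate_ids: Iterable[int], cache: LazyVectorCache) -> int:
    """Return the candidate id closest to the query, loading vectors lazily."""
    return min(candidate_ids, key=lambda i: squared_l2(query, cache.get(i)))


# Usage: a dict stands in for external storage.
store = {0: [0.0, 0.0], 1: [1.0, 1.0], 2: [5.0, 5.0]}
cache = LazyVectorCache(store.__getitem__, capacity=2)
best = nearest([0.9, 0.9], [0, 1, 2], cache)
```

The `misses` counter makes the trade-off visible: a smaller `capacity` saves memory but causes more external reads, which is the tension between memory use and storage access that the abstract describes.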
