# High Dimensional Similarity Search with Satellite System Graph: Efficiency, Scalability, and Unindexed Query Compatibility

**Authors:** Cong Fu, Changxu Wang, Deng Cai

arXiv: 1907.06146 · 2021-03-19

## TL;DR

This paper introduces Satellite System Graphs (SSG), a graph-based index for high-dimensional approximate nearest neighbor search (ANNS). SSG aims to improve search efficiency and scalability, and it comes with theoretical guarantees for both indexed queries and queries that are not in the database.

## Contribution

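The paper proposes SSG, a new family of MSNETs (monotonic search networks) produced by pruning the complete graph so that each node's out-edges are spread evenly in all directions, with a hyper-parameter that adjusts sparsity. It also proposes NSSG, a practical variant that reduces the indexing complexity of SSG for large-scale applications.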

## Key findings

- In the authors' theoretical and experimental analyses, the proposed approach shows strengths over existing representative ANNS algorithms.
- NSSG reduces the indexing complexity of SSG, which targets large-scale applications.
- SSG's theoretical properties are derived for both indexed and unindexed queries; NSG, by contrast, has no guarantee when the query is not indexed in the database.

## Abstract

Approximate Nearest Neighbor Search (ANNS) in high dimensional space is essential in database and information retrieval. Recently, there has been a surge of interest in exploring efficient graph-based indices for the ANNS problem. Among them, Navigating Spreading-out Graph (NSG) provides fine theoretical analysis and achieves state-of-the-art performance. However, we find there are several limitations with NSG: 1) NSG has no theoretical guarantee on nearest neighbor search when the query is not indexed in the database; 2) NSG is too sparse which harms the search performance. In addition, NSG suffers from high indexing complexity. To address the above problems, we propose the Satellite System Graphs (SSG) and a practical variant NSSG. Specifically, we propose a novel pruning strategy to produce SSGs from the complete graph. SSGs define a new family of MSNETs in which the out-edges of each node are distributed evenly in all directions. Each node in the graph builds effective connections to its neighborhood omnidirectionally, whereupon we derive SSG's excellent theoretical properties for both indexed and unindexed queries. We can adaptively adjust the sparsity of an SSG with a hyper-parameter to optimize the search performance. Further, NSSG is proposed to reduce the indexing complexity of the SSG for large-scale applications. Both theoretical and extensive experimental analyses are provided to demonstrate the strengths of the proposed approach over the existing representative algorithms. Our code has been released at https://github.com/ZJULearning/SSG.
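The abstract does not spell out the pruning rule, so the following is only a minimal sketch of one way to obtain out-edges that are "distributed evenly in all directions". It assumes an angle threshold: candidates are visited from nearest to farthest, and a candidate is kept only if its direction differs by at least `min_angle_deg` from every edge already kept. The function name `angle_prune` and its parameters are hypothetical and are not taken from the paper or its released code.

```python
import numpy as np


def angle_prune(points, p, candidates, min_angle_deg=60.0, max_degree=None):
    """Select out-neighbors of node p whose directions are pairwise separated.

    Illustrative assumption: an edge p->q is kept only if the angle between
    it and every previously kept edge is at least min_angle_deg.
    """
    points = np.asarray(points, dtype=float)
    cos_limit = np.cos(np.radians(min_angle_deg))

    # Visit candidates from nearest to farthest, so shorter edges are preferred.
    order = sorted(candidates, key=lambda q: np.linalg.norm(points[q] - points[p]))

    kept, kept_dirs = [], []
    for q in order:
        vec = points[q] - points[p]
        norm = np.linalg.norm(vec)
        if q == p or norm == 0.0:
            continue  # skip the node itself and exact duplicates
        direction = vec / norm
        # An angle >= threshold is equivalent to a cosine <= cos(threshold).
        if all(np.dot(direction, d) <= cos_limit + 1e-12 for d in kept_dirs):
            kept.append(q)
            kept_dirs.append(direction)
            if max_degree is not None and len(kept) >= max_degree:
                break
    return kept


# Toy 2-D data: node 0 at the origin with five candidate neighbors.
toy_points = [[0, 0], [1, 0], [2, 0.1], [0, 1], [-1, 0], [1.5, 1.5]]
sparse_neighbors = angle_prune(toy_points, 0, [1, 2, 3, 4, 5], min_angle_deg=60.0)
dense_neighbors = angle_prune(toy_points, 0, [1, 2, 3, 4, 5], min_angle_deg=30.0)
```

In this sketch the threshold plays the role of the sparsity hyper-parameter mentioned in the abstract: lowering `min_angle_deg` admits more edges per node, while raising it yields a sparser graph.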

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.06146/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1907.06146/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/1907.06146/full.md

---
Source: https://tomesphere.com/paper/1907.06146