MCGI: Manifold-Consistent Graph Indexing for Billion-Scale Disk-Resident Vector Search
Dongfang Zhao

TL;DR
MCGI is a geometry-aware, disk-resident graph index for billion-scale vector search that adapts its search to the manifold structure of the data, improving robustness and performance over conventional graph-based approaches.
Contribution
The paper proposes MCGI, a manifold-consistent graph index that adjusts its beam search budget according to Local Intrinsic Dimensionality (LID), which reduces sensitivity to data-specific hyperparameters.
Findings
MCGI outperforms five industry-standard baselines on five datasets.
It maintains stable performance across datasets of varying dimensionality.
Theoretical analysis confirms manifold-preserving topological connectivity.
Abstract
Graph-based Approximate Nearest Neighbor (ANN) search often suffers from performance degradation in high-dimensional spaces due to the Euclidean-Geodesic mismatch, where greedy routing diverges from the underlying data manifold. To address this challenge, this paper presents Manifold-Consistent Graph Indexing (MCGI), a geometry-aware and disk-resident indexing method that leverages Local Intrinsic Dimensionality (LID) to dynamically adapt search strategies to the intrinsic geometry of data. Unlike conventional algorithms that treat dimensions uniformly, MCGI modulates its beam search budget based on in-situ geometric analysis, which reduces sensitivity to data-specific hyperparameters by replacing a single scalar with a geometry-informed range that remains stable across datasets of varying dimensionality. Theoretical analysis demonstrates that MCGI provides robust approximation by…
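The abstract describes modulating the beam search budget from a local LID estimate but does not give the estimator or the mapping. As a rough illustration only, the sketch below uses the standard maximum-likelihood LID estimator over k-nearest-neighbor distances and a clamped linear map from LID to a beam width. The function names, the linear map, the direction of the map (higher LID gives a wider beam), and the bounds `l_min` and `l_max` are all assumptions made for this example, not details taken from the paper.

```python
import math


def lid_mle(distances):
    """Maximum-likelihood LID estimate from distances to the k nearest neighbors.

    Uses the standard estimator LID = -1 / mean(log(r_i / r_k)), where r_k is
    the largest neighbor distance. Zero distances (duplicates) are skipped.
    """
    r = sorted(d for d in distances if d > 0.0)
    if len(r) < 2:
        raise ValueError("need at least two positive neighbor distances")
    r_max = r[-1]
    mean_log = sum(math.log(d / r_max) for d in r) / len(r)
    if mean_log == 0.0:
        # All neighbors are equidistant: the estimate diverges.
        return float("inf")
    return -1.0 / mean_log


def beam_width(lid, lid_lo, lid_hi, l_min=32, l_max=256):
    """Hypothetical mapping: scale LID linearly into [l_min, l_max] and clamp.

    lid_lo and lid_hi are assumed to be dataset-level LID bounds, for example
    low and high percentiles measured at index-build time.
    """
    if lid_hi <= lid_lo:
        return l_min
    t = (lid - lid_lo) / (lid_hi - lid_lo)
    t = min(1.0, max(0.0, t))
    return round(l_min + t * (l_max - l_min))


if __name__ == "__main__":
    # Toy neighbor distances around one node.
    lid = lid_mle([1.0, 2.0, 4.0])
    print(lid, beam_width(lid, lid_lo=0.0, lid_hi=10.0))
```

Under these assumptions, a node in a locally low-dimensional region gets a narrow beam and fewer disk reads, while a node in a locally complex region gets a wider one; the single global beam width becomes the range `[l_min, l_max]`.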
