Infinity Search: Approximate Vector Search with Projections on q-Metric Spaces
Antonio Pariente, Ignacio Hounie, Santiago Segarra, Alejandro Ribeiro

TL;DR
This paper introduces a novel approximate vector search method using projections onto q-metric and ultrametric spaces, leveraging learned approximations to balance search speed and accuracy.
Contribution
It proposes a new approach to approximate vector search by transforming dissimilarity functions into ultrametric spaces via learned projections, enabling efficient search.
Findings
Increasing q improves search speed but reduces recall.
The method is competitive with existing vector search techniques.
Learned projections effectively approximate ultrametric distances.
Abstract
An ultrametric space or infinity-metric space is defined by a dissimilarity function that satisfies a strong triangle inequality in which every side of a triangle is not larger than the larger of the other two. We show that search in ultrametric spaces with a vantage point tree has worst-case complexity equal to the depth of the tree. Since datasets of interest are not ultrametric in general, we employ a projection operator that transforms an arbitrary dissimilarity function into an ultrametric space while preserving nearest neighbors. We further learn an approximation of this projection operator to efficiently compute ultrametric distances between query points and points in the dataset. We proceed to solve a more general problem in which we consider projections in $q$-metric spaces -- in which triangle sides raised to the power of $q$ are smaller than the sum of the $q$-powers of the…
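To make the abstract's definitions concrete, here is a minimal Python sketch; it is not the authors' implementation. It assumes the q-triangle inequality reads $d(x,z)^q \le d(x,y)^q + d(y,z)^q$, with the maximum replacing the sum as $q \to \infty$. It also assumes the projection is the minimum, over all paths between two points, of the q-norm of the edge lengths along the path. That is a common construction, but this page does not confirm it is the one used in the paper. The names `q_combine`, `q_projection`, and `satisfies_q_triangle` and the toy matrix `D` are hypothetical.

```python
import itertools
import math


def q_combine(a, b, q):
    """q-norm length of two concatenated path segments (max when q is infinite)."""
    if math.isinf(q):
        return max(a, b)
    return (a ** q + b ** q) ** (1.0 / q)


def q_projection(dist, q):
    """Assumed projection: all-pairs minimum q-norm path length.

    Uses a Floyd-Warshall-style relaxation, so it costs O(n^3) and is only
    meant for small illustrative inputs.
    """
    n = len(dist)
    proj = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = q_combine(proj[i][k], proj[k][j], q)
                if via_k < proj[i][j]:
                    proj[i][j] = via_k
    return proj


def satisfies_q_triangle(dist, q, tol=1e-9):
    """Check the (assumed) q-triangle inequality on every triple of points."""
    n = len(dist)
    return all(
        dist[i][j] <= q_combine(dist[i][k], dist[k][j], q) + tol
        for i, j, k in itertools.product(range(n), repeat=3)
    )


# Toy data (hypothetical): four points on a line at positions 0, 1, 3, 7.
D = [
    [0, 1, 3, 7],
    [1, 0, 2, 6],
    [3, 2, 0, 4],
    [7, 6, 4, 0],
]

ULTRA = q_projection(D, math.inf)  # minimax-path distances

print(satisfies_q_triangle(D, 1))            # ordinary triangle inequality
print(satisfies_q_triangle(D, math.inf))     # strong triangle inequality on raw data
print(satisfies_q_triangle(ULTRA, math.inf)) # strong triangle inequality after projection
```

On this toy input the raw distances satisfy the ordinary triangle inequality (q = 1) but not the strong one, while the projected matrix does. Each point's original nearest neighbor also remains among its closest points after projection, possibly tied with others.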
Peer Reviews
Decision: ICLR 2026 Conference Desk Rejected Submission
The paper explores a very interesting direction, extending nearest neighbor search from spaces with standard metrics to those with stronger triangle inequalities. For example, nearest neighbor search in ultrametric spaces seems to be particularly nice, as ultrametric balls are either nested or disjoint. The ideas are natural and simple, and the algorithms may be nice contributions to practical, approximate nearest neighbor search.
- Lemma 1 is not correct. The proof only shows that both conditions are not true at the same time. But they can both be false at the same time. For example, let $u = v$ and $\mu < d(x_o, u)$. - Algorithm 3 is also not correct. Consider six data points forming two very salient clusters: {A, B, C} and {U, V, W}. Let distances within each cluster be 1 and distances between clusters be 100. Let the query point Q be at distance 10 from points in {A, B, C} and distance 100 from point
- Introduces a novel framework that generalizes ANN search to arbitrary dissimilarity measures through mathematically grounded projections onto $q$-metric spaces. - Provides strong theoretical guarantees on distance preservation and query complexity. - Demonstrates competitive empirical performance against HNSW and ScaNN, showing both scalability and adaptability through a learnable projection mechanism.
- Infinity Search relies heavily on learning accurate projections into $q$-metric spaces, but the paper does not analyze how projection errors affect theoretical guarantees or retrieval accuracy. - The computational complexity of training and applying the learned projection is high, and scalability to billion-scale or streaming datasets remains unclear. - The approach assumes the projected space preserves neighborhood structure globally, yet real data often violate the $q$-metric inequalities
1. Compared with traditional hash-based or graph-based algorithms for ANN, the idea of mapping points into a q-metric space sounds very novel and worth exploring. 2. The authors provide a theoretical guarantee for their search algorithm as q approaches infinity.
1. I am curious how long it takes to calculate Dq for any pair of points from the dataset X. I didn’t find any reported time for training the MLP model or building the index. Appendix D.3 states that they can speed it up from O(n^3) to O(nkl), but I don’t quite understand this, since the output distance matrix would still be at least O(n^2). This seems impractical for any dataset larger than one million points. 2. The authors prove that under the q-metric space, the original nearest neighbor is
+ The paper studies a meaningful problem. + Experiments are conducted on benchmark datasets. + The proposed solution is supported by some theoretical analysis.
+ The motivation is unconvincing. + The technical novelty is limited. + The proposed solution has several (potential) limitations. + The evaluations are incomplete. + The related work is not comprehensive enough and should not be deferred to the appendix.
1. An interesting idea. 2. Strong **reported** results. 3. Code is provided.
Please see the summary. Basically, I do not think the code is doing what it is supposed to do and the evaluation is likely incorrect. **Detailed comments:** L031 Hanov does not seem to be an appropriate reference for this statement. These are more appropriate classic references: 1. Kevin S. Beyer, Jonathan Goldstein, Raghu Ramakrishnan, and Uri Shaft. When is “nearest neighbor” meaningful? 1999. Same mistake in L738: the classic VP-tree references: * Jeffrey K. Uhlmann. 1999. Satisf
Taxonomy
Topics: Data Management and Algorithms · Topological and Geometric Data Analysis · Advanced Image and Video Retrieval Techniques
