Dual-Branch HNSW Approach with Skip Bridges and LID-Driven Optimization
Hy Nguyen, Nguyen Hung Nguyen, Nguyen Linh Bao Nguyen, Srikanth Thudumu, Hung Du, Rajesh Vasa, Kon Mouzakis

TL;DR
This paper introduces a dual-branch HNSW algorithm with LID-driven insertion and skip bridges to improve approximate nearest neighbor search accuracy and speed, especially in high-dimensional datasets, without sacrificing inference efficiency.
Contribution
The paper presents a novel dual-branch HNSW structure with LID-based insertion and bridge-building techniques, addressing local optima and cluster disconnections in existing HNSW algorithms.
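As a rough illustration of the quantity behind "LID-based insertion", the sketch below estimates local intrinsic dimensionality (LID) at a point with the standard maximum-likelihood estimator over its k nearest-neighbor distances. The estimator choice, the function name `lid_mle`, and the parameter `k` are assumptions for illustration; this excerpt does not say which estimator the paper uses or how the LID value decides where a point is inserted.

```python
import math
import random


def lid_mle(query, points, k=20):
    """Maximum-likelihood LID estimate at `query` from its k nearest neighbors.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Distances to all other points, ignoring exact duplicates of the query.
    dists = sorted(
        d for d in (math.dist(query, p) for p in points) if d > 0.0
    )[:k]
    if len(dists) < 2:
        raise ValueError("need at least two neighbors at nonzero distance")
    r_max = dists[-1]  # distance to the k-th (farthest retained) neighbor
    mean_log_ratio = sum(math.log(d / r_max) for d in dists) / len(dists)
    if mean_log_ratio == 0.0:
        # All retained neighbors are equidistant; the estimate diverges.
        return float("inf")
    return -1.0 / mean_log_ratio


# Toy data: a 2-D plane embedded in 8-D space, so the ambient dimension is 8
# while the intrinsic dimension is 2.
rng = random.Random(0)
plane = [[rng.random(), rng.random()] + [0.0] * 6 for _ in range(2000)]
estimates = [lid_mle(p, plane, k=30) for p in plane[:50]]
avg_lid = sum(estimates) / len(estimates)
print(f"average LID estimate: {avg_lid:.2f}")
```

On this toy data the average estimate should sit near the intrinsic dimension of 2 rather than the ambient dimension of 8, which is the property that makes LID a plausible signal for spotting points in unusually complex regions.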
Findings
Achieved up to 30% recall improvement in CV tasks
Reduced construction time by up to 20%
Maintained inference speed without trade-offs
Abstract
The Hierarchical Navigable Small World (HNSW) algorithm is widely used for approximate nearest neighbor (ANN) search, leveraging the principles of navigable small-world graphs. However, it faces some limitations. The first is the local optima problem, which arises from the algorithm's greedy search strategy, selecting neighbors based solely on proximity at each step. This often leads to cluster disconnections. The second limitation is that HNSW frequently fails to achieve logarithmic complexity, particularly in high-dimensional datasets, due to the exhaustive traversal through each layer. To address these limitations, we propose a novel algorithm that mitigates local optima and cluster disconnections while improving construction speed and maintaining inference speed. The first component is a dual-branch HNSW structure with LID-based insertion mechanisms, enabling traversal from…
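The local optima problem described in the abstract can be seen on a toy graph. The sketch below is a minimal single-layer greedy search, not the paper's dual-branch algorithm: the graph, coordinates, and the function name `greedy_search` are all made up for illustration. Two clusters have no edge between them, so a search entering from the wrong cluster stops at a node that is far from the query.

```python
import math


def greedy_search(graph, coords, entry, query):
    """Move to the neighbor closest to `query` until no neighbor improves."""
    current = entry
    current_dist = math.dist(coords[current], query)
    while True:
        best, best_dist = current, current_dist
        for nb in graph[current]:
            d = math.dist(coords[nb], query)
            if d < best_dist:
                best, best_dist = nb, d
        if best == current:
            # No neighbor is closer: a local (possibly not global) optimum.
            return current, current_dist
        current, current_dist = best, best_dist


# Hypothetical toy data: cluster A = nodes 0-2, cluster B = nodes 3-4.
coords = {
    0: (0.0, 0.0),
    1: (1.0, 0.0),
    2: (1.0, 1.0),
    3: (10.0, 10.0),
    4: (9.0, 10.0),
}
# No edge links the two clusters, mimicking a cluster disconnection.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4], 4: [3]}

query = (10.0, 10.0)
found_from_a = greedy_search(graph, coords, entry=0, query=query)
found_from_b = greedy_search(graph, coords, entry=4, query=query)
print("entering from cluster A:", found_from_a)
print("entering from cluster B:", found_from_b)
```

Entering at node 0, the search gets stuck inside cluster A even though node 3 coincides with the query; entering at node 4 reaches node 3. Starting the search from more than one part of the graph is one way to reduce this failure mode, which is the general motivation the abstract gives for a dual-branch structure.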
Peer Reviews
Decision: Submitted to ICLR 2025
- Work on an important problem for practical application.
- I have implemented HNSW from scratch and published HNSW-related algorithms at top data mining conferences. The experimental setup is complete nonsense to me. It should follow the ANN-Benchmarks setup, which is a more reasonable way to show results.
- Apparently, the algorithm is not compared to the real HNSW implementation in wall-clock time, only to the authors' own variations.
- The algorithm seems to be implemented in Python only, which is sort of contradicting to the po
1. The dual-branch structure and LID-based insertion mechanism are well-motivated and novel.
2. The experiments have (partially) shown the effectiveness of the proposed method.
My major concerns are about the experiments.
1. The proposed method is implemented in Python, which is not a good programming language for comparing speed. The authors may provide a theoretical time complexity analysis to complement their empirical results. This would allow a fairer comparison of the algorithm's efficiency across different implementations.
2. The baselines are relatively weak; I did not see the advanced methods (e.g., IVFPQ) used in faiss [1]. I'd like to have authors justify
1) This paper presents a novel enhancement to the HNSW algorithm, addressing key limitations related to local optima and inference speed.
2) The research is evaluated across diverse benchmarks, including datasets from Computer Vision (CV), Deep Learning (DL), and Natural Language Processing (NLP). The experiments clearly support the proposed method's superiority in both accuracy and speed, with substantial performance gains.
3) The paper is well-written, with clear explanations of complex concepts
1) The related work section is overly concise, lacking a comprehensive review of current research, which limits contextual understanding of the contributions.
2) Although the paper claims improvements in inference speed, Figures 9 and 10 show only modest gains, casting doubt on the practical significance of this claim.
3) The experimental evaluations are relatively limited, with insufficient algorithmic comparisons; in particular, using only the GLOVE dataset for NLP benchmarks diminishes the persu
