An Efficient Index for Visual Search in Appearance-based SLAM
Kiana Hajebi, Hong Zhang

TL;DR
This paper uses a graph-based nearest neighbor search (GNNS) method to accelerate vector quantization in appearance-based SLAM, supporting real-time operation; the search index adds only a small cost to vocabulary construction.
Contribution
The paper integrates GNNS into bag-of-words (BoW) based SLAM: a k-NN graph over the vocabulary words serves as the search index, is built at small extra cost, and speeds up vector quantization relative to existing search methods.
Findings
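In the authors' experiments, GNNS performs vector quantization faster than state-of-the-art nearest neighbor search methods.
The k-NN graph over the vocabulary words can be obtained during vocabulary creation by adding one iteration to the k-means clustering, at a small extra cost.
Because images in appearance-based SLAM are acquired sequentially, the GNNS search can be sped up further.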
Abstract
Vector quantization can be a computationally expensive step in visual bag-of-words (BoW) search when the vocabulary is large. A BoW-based appearance SLAM system needs to tackle this problem to operate efficiently in real time. We propose an effective method to speed up the vector-quantization process in BoW-based visual SLAM. To this end, we employ a graph-based nearest neighbor search (GNNS) algorithm and show experimentally that it can outperform the state of the art. The graph-based search structure used in GNNS can be integrated efficiently into the BoW model and the SLAM framework. The graph-based index, which is a k-NN graph, is built over the vocabulary words and can be extracted from the BoW vocabulary construction procedure by adding one iteration to the k-means clustering, at a small extra cost. Moreover, exploiting the fact that images acquired for appearance-based SLAM…
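As a rough illustration of the idea in the abstract, the sketch below quantizes a descriptor by greedy search over a k-NN graph of vocabulary words. It is not the authors' implementation: the function names `build_knn_graph` and `gnns_quantize`, the `restarts` and `max_steps` parameters, and the random 2-D "vocabulary" are all hypothetical. The graph is built by brute force here as a stand-in, whereas the paper derives it from the k-means vocabulary construction.

```python
import math
import random


def build_knn_graph(words, k):
    """Brute-force k-NN graph (stand-in for the paper's k-means-derived graph).

    Returns, for each word, the indices of its k nearest other words.
    """
    graph = []
    for i, w in enumerate(words):
        others = [j for j in range(len(words)) if j != i]
        others.sort(key=lambda j: math.dist(w, words[j]))
        graph.append(others[:k])
    return graph


def gnns_quantize(query, words, graph, restarts=3, max_steps=50, rng=None):
    """Greedy graph search; returns the index of the closest word found.

    The result is approximate: each restart stops at a local minimum
    (or after max_steps moves), and the best local minimum is kept.
    """
    rng = rng or random.Random(0)
    best, best_d = None, math.inf
    for _ in range(restarts):
        node = rng.randrange(len(words))  # assumed: random starting word
        node_d = math.dist(query, words[node])
        for _ in range(max_steps):
            # Candidate move: the neighbor closest to the query.
            nxt = min(graph[node], key=lambda j: math.dist(query, words[j]))
            nxt_d = math.dist(query, words[nxt])
            if nxt_d >= node_d:
                break  # no neighbor is closer: local minimum reached
            node, node_d = nxt, nxt_d
        if node_d < best_d:
            best, best_d = node, node_d
    return best


# Toy usage with a hypothetical 2-D vocabulary of 200 words.
rng = random.Random(42)
vocabulary = [(rng.random(), rng.random()) for _ in range(200)]
graph = build_knn_graph(vocabulary, k=8)
descriptor = (0.3, 0.7)
word_id = gnns_quantize(descriptor, vocabulary, graph)
exact_id = min(range(len(vocabulary)),
               key=lambda j: math.dist(descriptor, vocabulary[j]))
```

Comparing `word_id` with `exact_id` shows how often the greedy search recovers the true nearest word. Each greedy step only evaluates the k neighbors of the current word, so the number of distance computations depends on k and the path length rather than on the vocabulary size, which is where a speedup over exhaustive quantization could come from.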
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · k-Nearest Neighbors
