Context-Based Visual-Language Place Recognition

Soojin Woo; Seong-Woo Kim

arXiv:2410.19341 · cs.RO · October 28, 2024

TL;DR

This paper applies a zero-shot, language-driven semantic segmentation model to visual place recognition (VPR). The method is robust to scene changes, requires no additional training, and outperforms non-learned image representations and off-the-shelf CNN descriptors.

Contribution

The authors propose a novel VPR approach using pixel-level embeddings from a zero-shot semantic segmentation model, eliminating the need for training and improving robustness to scene variations.
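To make the idea of matching places from pixel-level embeddings concrete, here is a minimal sketch of VPR as nearest-neighbor retrieval. It is not the authors' pipeline: the mean-pooling aggregation, the cosine-similarity matching, and every name (`mean_pool`, `cosine_similarity`, `recognize_place`) are assumptions made for illustration, and the toy 2-D vectors stand in for embeddings that a segmentation model would produce.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def mean_pool(pixel_embeddings: Sequence[Sequence[float]]) -> Vector:
    """Average per-pixel embeddings into one image-level descriptor.

    Assumption: simple mean pooling; the paper may aggregate differently.
    """
    if not pixel_embeddings:
        raise ValueError("need at least one pixel embedding")
    dim = len(pixel_embeddings[0])
    sums = [0.0] * dim
    for emb in pixel_embeddings:
        for i, value in enumerate(emb):
            sums[i] += value
    count = len(pixel_embeddings)
    return [s / count for s in sums]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize_place(
    query_pixels: Sequence[Sequence[float]],
    database: Dict[str, Vector],
) -> str:
    """Return the ID of the database place most similar to the query image."""
    query_descriptor = mean_pool(query_pixels)
    return max(
        database,
        key=lambda place_id: cosine_similarity(query_descriptor, database[place_id]),
    )


# Hypothetical reference database: place ID -> pooled descriptor.
database = {
    "corridor": mean_pool([[1.0, 0.0], [0.9, 0.1]]),
    "lobby": mean_pool([[0.0, 1.0], [0.1, 0.9]]),
}
best_match = recognize_place([[0.8, 0.2], [1.0, 0.0]], database)
```

In a real system the per-pixel vectors would come from the segmentation model's output, and the database would hold one descriptor per reference image; the retrieval step itself would stay the same.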

Findings

01 Outperforms non-learned image representations
02 Outperforms off-the-shelf CNN descriptors
03 Effective in challenging real-world scenarios

Abstract

In vision-based robot localization and SLAM, Visual Place Recognition (VPR) is essential. This paper addresses the problem of VPR, which involves accurately recognizing the location corresponding to a given query image. A popular approach to vision-based place recognition relies on low-level visual features. Despite significant progress in recent years, place recognition based on low-level visual features is challenging when there are changes in scene appearance. To address this, end-to-end training approaches have been proposed to overcome the limitations of hand-crafted features. However, these approaches still fail under drastic changes and require large amounts of labeled data to train models, presenting a significant limitation. Methods that leverage high-level semantic information, such as objects or categories, have been proposed to handle variations in appearance. In this paper,…

Code & Models

Repositories

woo-soojin/context-based-vlpr
PyTorch · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Speech and dialogue systems