Enhancing Landmark Detection in Cluttered Real-World Scenarios with Vision Transformers

Mohammad Javad Rajabi; Morteza Mirzai; Ahmad Nickabadi

arXiv:2308.13671 · cs.CV · August 29, 2023

TL;DR

This paper introduces a vision transformer-based method for landmark detection in cluttered real-world scenes. A selection process identifies the image patches that correspond to occluding objects such as humans, cars, and trees, and the authors report higher accuracy on augmented datasets they created for evaluation.

Contribution

The paper presents a patch-selection process within vision transformers that improves landmark detection in cluttered environments.

Findings

01 Achieves superior accuracy on augmented datasets

02 Effectively isolates relevant image patches

03 Demonstrates the potential of transformers in cluttered scenarios

Abstract

Visual place recognition tasks often encounter significant challenges in landmark detection due to the presence of irrelevant objects such as humans, cars, and trees, despite the remarkable progress achieved by previous models, especially in the context of transformers. To address this issue, we propose a novel method that effectively leverages the strengths of vision transformers. By employing a meticulous selection process, our approach identifies and isolates specific patches within the image that correspond to occluding objects. To evaluate the efficacy of our method, we created augmented datasets and conducted comprehensive testing. The results demonstrate the superior accuracy achieved by our proposed approach. This research contributes to the advancement of landmark detection in visual place recognition and shows the potential of leveraging vision transformers to overcome…
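The abstract does not say how occluding patches are identified, so the following is only a minimal sketch of the general idea of patch selection, not the authors' method. It assumes a binary occluder mask is already available (for example, from a separate segmentation model) and keeps only the ViT-style patches whose occluded fraction stays at or below a threshold. The function names, the `max_occlusion` parameter, and its 0.5 default are all hypothetical.

```python
from typing import List, Sequence, Tuple


def patch_grid(height: int, width: int, patch: int) -> List[Tuple[int, int]]:
    """Top-left (row, col) corners of non-overlapping patches, row-major order."""
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    return [(r, c) for r in range(0, height, patch) for c in range(0, width, patch)]


def occluded_fraction(
    mask: Sequence[Sequence[int]], top: int, left: int, patch: int
) -> float:
    """Fraction of pixels in one patch that the mask flags as occluder (1)."""
    flagged = sum(
        mask[r][c]
        for r in range(top, top + patch)
        for c in range(left, left + patch)
    )
    return flagged / (patch * patch)


def select_patches(
    mask: Sequence[Sequence[int]], patch: int, max_occlusion: float = 0.5
) -> List[int]:
    """Indices of patches to keep, i.e. those not dominated by occluders.

    Assumption: the threshold rule is illustrative; the paper's actual
    selection criterion is not described in the abstract.
    """
    height, width = len(mask), len(mask[0])
    return [
        index
        for index, (top, left) in enumerate(patch_grid(height, width, patch))
        if occluded_fraction(mask, top, left, patch) <= max_occlusion
    ]


# Toy 4x4 mask split into four 2x2 patches; the top-right patch is fully occluded.
example_mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
kept = select_patches(example_mask, patch=2)
print(kept)  # [0, 2, 3]
```

In a transformer pipeline, the returned indices would typically be used to gather the corresponding patch tokens before they enter the encoder, so that occluded regions contribute nothing to the landmark descriptor.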

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Advanced Neural Network Applications