Towards Global Localization using Multi-Modal Object-Instance Re-Identification

Aneesh Chavan; Vaibhav Agrawal; Vineeth Bhat; Sarthak Chittawar; Siddharth Srivastava; Chetan Arora; K Madhava Krishna

arXiv:2409.12002·cs.RO·May 2, 2025

TL;DR

This paper introduces a dual-path multimodal transformer for object-instance re-identification from RGB and depth data, improving ReID in scenes that are cluttered or have varying illumination, and pairs it with a ReID-based camera localization framework.

Contribution

It proposes a dual-path transformer model for multimodal ReID and a ReID-based localization framework, validated on custom and public RGB-D datasets, advancing robotic perception capabilities.

Findings

01 ReID reaches an mAP of 75.18

02 Localization success rate of 83% on TUM-RGBD

03 Depth data enhances ReID robustness in challenging scenes
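For readers unfamiliar with the mAP figure quoted above, here is a minimal sketch of how mean average precision is commonly computed for retrieval-style ReID. It is not the paper's evaluation code: the function names and the toy rankings are hypothetical, and it assumes every true match for a query appears somewhere in that query's ranked list.

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """AP for one query: mean of precision@k over the ranks k holding a correct match.

    Assumption: all true matches for the query appear in the ranked list,
    so the number of hits found equals the number of relevant items.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_relevance, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries: Sequence[Sequence[bool]]) -> float:
    """mAP: the unweighted mean of per-query AP values."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)


# Hypothetical toy rankings (True = gallery item is the same object instance).
example = [
    [True, False, True],  # AP = (1/1 + 2/3) / 2
    [False, True],        # AP = 1/2
]
map_score = mean_average_precision(example)
```

A score on this scale lies in [0, 1]; a value such as 75.18 would correspond to the same quantity expressed as a percentage, assuming that is how the figure is reported.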

Abstract

Re-identification (ReID) is a critical challenge in computer vision, predominantly studied in the context of pedestrians and vehicles. However, robust object-instance ReID, which has significant implications for tasks such as autonomous exploration, long-term perception, and scene understanding, remains underexplored. In this work, we address this gap by proposing a novel dual-path object-instance re-identification transformer architecture that integrates multimodal RGB and depth information. By leveraging depth data, we demonstrate improvements in ReID across scenes that are cluttered or have varying illumination conditions. Additionally, we develop a ReID-based localization framework that enables accurate camera localization and pose identification across different viewpoints. We validate our methods using two custom-built RGB-D datasets, as well as multiple sequences from the…

Code & Models

Repositories

instance-based-loc/instance-based-loc
PyTorch · Official

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Image and Object Detection Techniques

Methods: 1x1 Convolution · Batch Normalization · Convolution · Thinned U-shape Module