Signal: Selective Interaction and Global-local Alignment for Multi-Modal Object Re-Identification

Yangyang Liu; Yuhao Wang; Pingping Zhang

arXiv:2511.17965·cs.CV·November 25, 2025

TL;DR

This paper introduces Signal, a multi-modal object Re-ID framework that employs selective interaction and global-local alignment modules to enhance feature discriminability and reduce background interference, validated on three benchmarks.

Contribution

The paper proposes a novel framework with selective interaction and alignment modules, improving multi-modal feature discrimination and consistency in object Re-ID tasks.

Findings

01. Outperforms existing methods on the RGBNT201, RGBNT100, and MSVR310 benchmarks.

02. Reduces background interference and enhances feature discriminability.

03. Demonstrates significant improvements in multi-modal object Re-ID accuracy.

Abstract

Multi-modal object Re-IDentification (ReID) aims to retrieve specific objects by exploiting complementary information from multi-modal images. Existing methods mainly concentrate on fusing multi-modal features, yet neglect background interference. In addition, current multi-modal fusion methods often focus on aligning modality pairs but struggle to achieve consistent alignment across all modalities. To address these issues, we propose a novel selective interaction and global-local alignment framework called Signal for multi-modal object ReID. Specifically, we first propose a Selective Interaction Module (SIM) to select important patch tokens using intra-modal and inter-modal information. These important patch tokens then interact with the class tokens, thereby yielding more discriminative features. Then, we propose a Global Alignment Module (GAM) to simultaneously align…
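To make the token-selection idea in the abstract more concrete, here is a minimal NumPy sketch. It is not the paper's SIM. It assumes that a patch token's importance can be approximated by its dot-product similarity to its own modality's class token (intra-modal) plus its mean similarity to the other modalities' class tokens (inter-modal), and that the interaction is a single softmax attention step from each class token to its selected patches. The function name `select_and_interact`, the scoring rule, and the parameter `k` are all hypothetical choices made for illustration.

```python
import numpy as np


def select_and_interact(cls_tokens, patch_tokens, k):
    """Illustrative token selection and class-token interaction (assumed scheme).

    cls_tokens:   (M, D) array, one class token per modality.
    patch_tokens: (M, N, D) array, N patch tokens per modality.
    k:            number of patch tokens kept per modality.
    Returns refined class tokens (M, D) and the kept patch indices (M, k).
    """
    M, N, D = patch_tokens.shape

    # Intra-modal score: each patch against its own modality's class token.
    intra = np.einsum("mnd,md->mn", patch_tokens, cls_tokens)

    # Inter-modal score: each patch against the class tokens of the
    # other modalities, averaged (assumption: plain mean, no learned weights).
    all_sim = np.einsum("mnd,cd->mnc", patch_tokens, cls_tokens)
    inter = (all_sim.sum(axis=2) - intra) / max(M - 1, 1)

    # Keep the k highest-scoring patches per modality.
    scores = intra + inter
    top_idx = np.argsort(-scores, axis=1)[:, :k]
    selected = np.take_along_axis(patch_tokens, top_idx[:, :, None], axis=1)

    # Each class token attends only to its selected patches (softmax attention),
    # and the attended summary is added back as a residual.
    logits = np.einsum("md,mkd->mk", cls_tokens, selected) / np.sqrt(D)
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    refined = cls_tokens + np.einsum("mk,mkd->md", weights, selected)
    return refined, top_idx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cls = rng.normal(size=(3, 8))          # 3 modalities, 8-dim tokens
    patches = rng.normal(size=(3, 16, 8))  # 16 patches per modality
    refined, kept = select_and_interact(cls, patches, k=4)
    print(refined.shape, kept.shape)
```

Patches that are never selected, which under this assumed scoring would tend to include background regions, contribute nothing to the refined class tokens. A learned scoring function would replace the fixed dot products in practice.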

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Visual Attention and Saliency Detection