MambaPro: Multi-Modal Object Re-Identification with Mamba Aggregation and Synergistic Prompt

Yuhao Wang, Xuehu Liu, Tianyu Yan, Yang Liu, Aihua Zheng, Pingping Zhang, and Huchuan Lu

arXiv:2412.10707·cs.CV·December 17, 2024

TL;DR

MambaPro is a multi-modal object ReID framework that adapts the large-scale pre-trained model CLIP using a parallel adapter, synergistic prompts, and Mamba-based aggregation, yielding more robust features and better performance on three benchmarks.

Contribution

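The paper proposes MambaPro, a framework that adapts CLIP to multi-modal ReID using a Parallel Feed-Forward Adapter (PFA), a Synergistic Residual Prompt (SRP), and Mamba Aggregation. It targets the limitations of existing aggregation methods on long multi-modal sequences and improves feature robustness.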

Findings

01. Outperforms existing methods on three benchmarks.

02. Efficiently models interactions between modalities.

03. Extracts more robust features with lower complexity.

Abstract

Multi-modal object Re-IDentification (ReID) aims to retrieve specific objects by utilizing complementary image information from different modalities. Recently, large-scale pre-trained models like CLIP have demonstrated impressive performance in traditional single-modal object ReID tasks. However, they remain unexplored for multi-modal object ReID. Furthermore, current multi-modal aggregation methods have obvious limitations in dealing with long sequences from different modalities. To address the above issues, we introduce a novel framework called MambaPro for multi-modal object ReID. To be specific, we first employ a Parallel Feed-Forward Adapter (PFA) for adapting CLIP to multi-modal object ReID. Then, we propose the Synergistic Residual Prompt (SRP) to guide the joint learning of multi-modal features. Finally, leveraging Mamba's superior scalability for long sequences, we introduce Mamba…

Code & Models

Repositories

924973292/mambapro (PyTorch, official)

Videos

MambaPro: Multi-Modal Object Re-identification with Mamba Aggregation and Synergistic Prompt

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques

Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces · Contrastive Language-Image Pre-training · Adapter