V$^{2}$-SAM: Marrying SAM2 with Multi-Prompt Experts for Cross-View Object Correspondence

Jiancheng Pan; Runze Wang; Tianwen Qian; Mohammad Mahdi; Yanwei Fu; Xiangyang Xue; Xiaomeng Huang; Luc Van Gool; Danda Pani Paudel; Yuqian Fu

arXiv:2511.20886·cs.CV·April 9, 2026

TL;DR

V2-SAM is a novel framework that extends SAM2 for cross-view object correspondence by integrating geometry-aware and appearance-guided prompts, achieving state-of-the-art results across multiple datasets.

Contribution

The paper introduces V2-SAM, which combines two complementary cross-view prompt generators with a multi-expert design in which the expert is selected adaptively via cyclic consistency, improving cross-view object correspondence.

Findings

01. Achieves new state-of-the-art on Ego-Exo4D, DAVIS-2017, and HANDAL-X datasets.

02. Effectively combines geometry-aware and appearance-guided prompts.

03. Demonstrates the benefit of adaptive expert selection via cyclic consistency.
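
To make the idea of expert selection via cyclic consistency concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes each expert is available as a pair of hypothetical callables, `forward` (query view to target view) and `backward` (target view back to query view), and that the cycle score is the IoU between the original query mask and its round-trip reconstruction. The function names, the IoU criterion, and the toy experts built from `np.roll` are all illustrative assumptions.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection-over-union of two binary masks of the same shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def select_expert_by_cycle(query_mask, experts):
    """Pick the expert whose round trip best reproduces the query mask.

    `experts` maps a name to a (forward, backward) pair of callables.
    Both callables are hypothetical stand-ins for real cross-view models.
    Returns (best_name, best_cycle_iou, best_target_prediction).
    """
    best_name, best_score, best_pred = None, -1.0, None
    for name, (forward, backward) in experts.items():
        pred = forward(query_mask)      # prediction in the target view
        recon = backward(pred)          # mapped back to the query view
        score = mask_iou(query_mask, recon)
        if score > best_score:
            best_name, best_score, best_pred = name, score, pred
    return best_name, best_score, best_pred


# Toy setup: a "view change" is simulated as a horizontal shift.
query = np.zeros((8, 8), dtype=bool)
query[2:5, 1:4] = True

experts = {
    # Round trip is exact: shift right by 2, then left by 2.
    "good": (lambda m: np.roll(m, 2, axis=1), lambda m: np.roll(m, -2, axis=1)),
    # Round trip is off by one column, so the cycle IoU drops.
    "bad": (lambda m: np.roll(m, 2, axis=1), lambda m: np.roll(m, -1, axis=1)),
}

name, score, target_pred = select_expert_by_cycle(query, experts)
print(name, score)
```

In this toy case the "good" expert is selected because its reconstruction matches the query mask exactly, while the "bad" expert's reconstruction is shifted by one column. A useful property of such a criterion is that it needs no ground-truth mask in the target view at inference time.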

Abstract

Cross-view object correspondence, exemplified by the representative task of ego-exo object correspondence, aims to establish consistent associations of the same object across different viewpoints (e.g., egocentric and exocentric). This task poses significant challenges due to drastic viewpoint and appearance variations, making existing segmentation models, such as SAM2, difficult to apply directly. To address this, we present V2-SAM, a unified cross-view object correspondence framework that adapts SAM2 from single-view segmentation to cross-view correspondence through two complementary prompt generators. Specifically, the Cross-View Anchor Prompt Generator (V2-Anchor), built upon DINOv3 features, establishes geometry-aware correspondences and, for the first time, enables coordinate-based prompting for SAM2 in cross-view scenarios, while the Cross-View Visual Prompt Generator (V2-Visual)…
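
The abstract states only that V2-Anchor builds on DINOv3 features and enables coordinate-based prompting. As a rough illustration of how a dense-feature match can yield a point prompt, the sketch below averages the query-view features inside the object mask into a prototype, computes cosine similarity against every target-view location, and takes the best match as an (x, y) anchor. This matching rule is an assumption made for illustration, not the paper's method, and the small hand-built arrays stand in for real backbone features. No DINOv3 or SAM2 API is called.

```python
import numpy as np


def anchor_point_from_features(src_feats, src_mask, tgt_feats, eps=1e-8):
    """Return an (x, y) anchor on the target feature grid plus the similarity map.

    src_feats, tgt_feats: arrays of shape (H, W, C) with dense features
    (placeholders here; a real system would take them from a backbone).
    src_mask: array of shape (H, W), True inside the query object.
    """
    src_mask = np.asarray(src_mask, dtype=bool)
    if not src_mask.any():
        raise ValueError("src_mask must contain at least one foreground cell")

    # Prototype: mean feature over the masked query region, L2-normalised.
    proto = src_feats[src_mask].mean(axis=0)
    proto = proto / (np.linalg.norm(proto) + eps)

    # Cosine similarity between the prototype and every target location.
    tgt = tgt_feats / (np.linalg.norm(tgt_feats, axis=-1, keepdims=True) + eps)
    sim = tgt @ proto  # shape (H, W)

    # Best-matching cell, reported as (x, y) = (column, row).
    y, x = np.unravel_index(np.argmax(sim), sim.shape)
    return (int(x), int(y)), sim


# Toy features: channel 0 marks "object", channel 1 marks "background".
H, W, C = 8, 8, 4
src_feats = np.zeros((H, W, C))
src_feats[..., 1] = 1.0
src_mask = np.zeros((H, W), dtype=bool)
src_mask[1:3, 1:3] = True
src_feats[src_mask] = [1.0, 0.0, 0.0, 0.0]

tgt_feats = np.zeros((H, W, C))
tgt_feats[..., 1] = 1.0
tgt_feats[5, 2] = [1.0, 0.0, 0.0, 0.0]  # object appears at row 5, column 2

anchor_xy, sim_map = anchor_point_from_features(src_feats, src_mask, tgt_feats)
print(anchor_xy)
```

The anchor is expressed on the feature grid, so a real pipeline would still need to rescale it to image pixels before passing it to a promptable segmenter as a point prompt.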

Code & Models

Models

🤗 Hugging Face model: jaychempan/V2-SAM
