PanMatch: Unleashing the Potential of Large Vision Models for Unified Matching Models

Yongjian Zhang; Longguang Wang; Kunhong Li; Ye Zhang; Yun Wang; Liang Lin; Yulan Guo

arXiv:2507.08400·cs.CV·July 14, 2025

TL;DR

PanMatch is a unified large vision model that handles various two-frame correspondence matching tasks within a single framework, eliminating the need for task-specific architectures and enabling zero-shot cross-view matching across diverse domains.

Contribution

The paper proposes PanMatch, a versatile foundation model that unifies multiple matching tasks using a single displacement estimation framework and a robust feature extractor, trained on a large cross-domain dataset.

Findings

01. Outperforms UniMatch and Flow-Anything on cross-task evaluations.

02. Achieves comparable results to state-of-the-art task-specific algorithms.

03. Demonstrates strong zero-shot performance in challenging scenarios like rainy weather and satellite images.

Abstract

This work presents PanMatch, a versatile foundation model for robust correspondence matching. Unlike previous methods that rely on task-specific architectures and domain-specific fine-tuning to support tasks like stereo matching, optical flow or feature matching, our key insight is that any two-frame correspondence matching task can be addressed within a 2D displacement estimation framework using the same model weights. Such a formulation eliminates the need for designing specialized unified architectures or task-specific ensemble models. Instead, it achieves multi-task integration by endowing displacement estimation algorithms with unprecedented generalization capabilities. To this end, we highlight the importance of a robust feature extractor applicable across multiple domains and tasks, and propose a feature transformation pipeline that leverages all-purpose features from Large…
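To make the 2D displacement formulation concrete, here is a minimal NumPy sketch of how the outputs of the three tasks named in the abstract can share one representation. It is an illustration only, not the PanMatch implementation: the function names `disparity_to_displacement` and `displacement_to_matches` are hypothetical, and the sign convention assumes rectified stereo with the left image as reference, so a pixel at (x, y) matches (x - d, y) in the right image. Optical flow needs no conversion, since it is already a per-pixel 2D displacement.

```python
import numpy as np


def disparity_to_displacement(disparity):
    """Express a stereo disparity map (H, W) as a 2D displacement field (H, W, 2).

    Assumption: rectified pair, left image as reference, so the match of
    (x, y) lies at (x - d, y): horizontal shift -d, vertical shift 0.
    """
    dx = -disparity
    dy = np.zeros_like(disparity)
    return np.stack([dx, dy], axis=-1)


def displacement_to_matches(displacement, stride=1):
    """Sample a dense displacement field into sparse matches (N, 4).

    Each row is (x0, y0, x1, y1): a source pixel and its displaced target.
    """
    h, w, _ = displacement.shape
    ys, xs = np.mgrid[0:h:stride, 0:w:stride]
    src = np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(np.float64)
    dst = src + displacement[ys, xs].reshape(-1, 2)
    return np.concatenate([src, dst], axis=1)


# Toy input: a constant disparity of 2 pixels on a 4x6 image.
disparity = np.full((4, 6), 2.0)
field = disparity_to_displacement(disparity)
matches = displacement_to_matches(field, stride=2)
```

In this toy case every sampled match is shifted two pixels to the left with no vertical change, which is what a constant disparity implies. A real pipeline would also need validity masks for targets that fall outside the second image.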

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning