PanMatch: Unleashing the Potential of Large Vision Models for Unified Matching Models
Yongjian Zhang, Longguang Wang, Kunhong Li, Ye Zhang, Yun Wang, Liang Lin, Yulan Guo

TL;DR
PanMatch introduces a unified large vision model capable of addressing various two-frame correspondence matching tasks within a single framework, eliminating the need for task-specific architectures and enabling zero-shot cross-view matching across diverse domains.
Contribution
The paper proposes PanMatch, a versatile foundation model that unifies multiple matching tasks using a single displacement estimation framework and a robust feature extractor, trained on a large cross-domain dataset.
Findings
Outperforms UniMatch and Flow-Anything on cross-task evaluations.
Achieves results comparable to state-of-the-art task-specific algorithms.
Demonstrates strong zero-shot performance in challenging scenarios like rainy weather and satellite images.
Abstract
This work presents PanMatch, a versatile foundation model for robust correspondence matching. Unlike previous methods that rely on task-specific architectures and domain-specific fine-tuning to support tasks like stereo matching, optical flow or feature matching, our key insight is that any two-frame correspondence matching task can be addressed within a 2D displacement estimation framework using the same model weights. Such a formulation eliminates the need for designing specialized unified architectures or task-specific ensemble models. Instead, it achieves multi-task integration by endowing displacement estimation algorithms with unprecedented generalization capabilities. To this end, we highlight the importance of a robust feature extractor applicable across multiple domains and tasks, and propose a feature transformation pipeline that leverages all-purpose features from Large…
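To make the 2D displacement formulation concrete, here is a minimal sketch of how outputs for different matching tasks can share one dense per-pixel (dx, dy) representation. It is not PanMatch's implementation: the function names are hypothetical, and the stereo case assumes a rectified pair in which a left-image pixel (x, y) matches (x - d, y) in the right image. Optical flow needs no conversion because it is already a 2D displacement field.

```python
import numpy as np


def disparity_to_displacement(disparity):
    """Express a stereo disparity map (H, W) as a 2D displacement field (H, W, 2).

    Assumption: rectified stereo, so matches move only horizontally and a
    left-image pixel (x, y) corresponds to (x - d, y) in the right image.
    """
    dx = -disparity                  # horizontal shift toward the right image
    dy = np.zeros_like(disparity)    # no vertical motion after rectification
    return np.stack([dx, dy], axis=-1)


def displacement_to_matches(displacement, keypoints):
    """Turn a dense displacement field into sparse feature matches.

    keypoints: integer array of shape (N, 2) holding (x, y) pixel coordinates
    in the first image. Returns the matched (x, y) coordinates in the second
    image by reading the displacement at each keypoint.
    """
    xs = keypoints[:, 0]
    ys = keypoints[:, 1]
    return keypoints + displacement[ys, xs]


# Usage: a constant disparity of 2 pixels on a 4x5 image.
field = disparity_to_displacement(np.full((4, 5), 2.0))
matches = displacement_to_matches(field, np.array([[3, 1], [4, 2]]))
```

Under this view, stereo matching and sparse feature matching differ only in how a displacement field is constrained or sampled, which is why a single displacement estimator can in principle serve all of them.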
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
