UFM: A Simple Path towards Unified Dense Correspondence with Flow

Yuchen Zhang; Nikhil Keetha; Chenwei Lyu; Bhuvan Jhamb; Yutian Chen; Yuheng Qiu; Jay Karhade; Shreyas Jha; Yaoyu Hu; Deva Ramanan; Sebastian Scherer; Wenshan Wang

arXiv:2506.09278·cs.CV·February 11, 2026

TL;DR

The paper introduces UFM, a unified transformer-based model trained on unified data for pixels that are co-visible in both images. It improves dense correspondence accuracy and speed across applications and outperforms specialized methods such as Unimatch and RoMa.

Contribution

The paper presents UFM, a simple, unified transformer model trained on combined data for dense correspondence, outperforming specialized approaches in accuracy and speed.

Findings

01. UFM is 28% more accurate than state-of-the-art flow methods (Unimatch).

02. UFM has 62% less error and is 6.7x faster than dense wide-baseline matchers (RoMa).

03. Unified training enables superior performance across multiple correspondence tasks.

Abstract

Dense image correspondence is central to many applications, such as visual odometry, 3D reconstruction, object association, and re-identification. Historically, dense correspondence has been tackled separately for wide-baseline scenarios and optical flow estimation, despite the common goal of matching content between two images. In this paper, we develop a Unified Flow & Matching model (UFM), which is trained on unified data for pixels that are co-visible in both source and target images. UFM uses a simple, generic transformer architecture that directly regresses the (u,v) flow. It is easier to train and more accurate for large flows than the typical coarse-to-fine cost volumes in prior work. UFM is 28% more accurate than state-of-the-art flow methods (Unimatch), while also having 62% less error and being 6.7x faster than dense wide-baseline matchers (RoMa). UFM is the first to…
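To make the idea of a regressed (u,v) flow on co-visible pixels concrete, the following is a minimal sketch, not the authors' code. It assumes a flow field stored as nested lists where `flow[y][x] = (u, v)`, and a boolean co-visibility mask of the same shape. The function names `flow_to_correspondences` and `masked_epe` are hypothetical. The mean end-point error used here is a common optical flow metric, chosen for illustration; this excerpt does not state which error metric the paper reports.

```python
import math
from typing import List, Optional, Tuple

Flow = List[List[Tuple[float, float]]]  # flow[y][x] = (u, v), in pixels
Mask = List[List[bool]]                 # covisible[y][x]: pixel seen in both images


def flow_to_correspondences(flow: Flow, covisible: Mask):
    """Map each co-visible source pixel (x, y) to its target location (x + u, y + v)."""
    matches = []
    for y, row in enumerate(flow):
        for x, (u, v) in enumerate(row):
            if covisible[y][x]:
                matches.append(((x, y), (x + u, y + v)))
    return matches


def masked_epe(pred: Flow, gt: Flow, covisible: Mask) -> Optional[float]:
    """Mean end-point error over co-visible pixels; None if no pixel is co-visible."""
    total, count = 0.0, 0
    for y, row in enumerate(pred):
        for x, (u, v) in enumerate(row):
            if covisible[y][x]:
                gu, gv = gt[y][x]
                total += math.hypot(u - gu, v - gv)
                count += 1
    return total / count if count else None


# Toy 2x2 example with made-up values (illustrative only, not data from the paper).
pred = [[(1.0, 0.0), (0.0, 0.0)], [(3.0, 4.0), (2.0, 2.0)]]
gt = [[(1.0, 0.0), (0.0, 0.0)], [(0.0, 0.0), (2.0, 2.0)]]
covisible = [[True, False], [True, True]]

matches = flow_to_correspondences(pred, covisible)
epe = masked_epe(pred, gt, covisible)
print(matches)
print(epe)
```

The mask matters because a pixel that is occluded or outside the second image has no valid match, so including it would distort both the correspondence set and the error.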

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization