UniCorrn: Unified Correspondence Transformer Across 2D and 3D

Prajnan Goswami; Tianye Ding; Feng Liu; Huaizu Jiang

arXiv:2605.04044·cs.CV·May 6, 2026

TL;DR

UniCorrn introduces a unified Transformer-based model for geometric correspondence across 2D images and 3D point clouds, enabling flexible, end-to-end learning for multiple modalities.

Contribution

It is the first shared-weight model that unifies 2D-2D, 2D-3D, and 3D-3D correspondence tasks using a dual-stream Transformer architecture.

Findings

01. Achieves competitive 2D-2D matching performance.

02. Surpasses state-of-the-art by 8% on 7Scenes (2D-3D).

03. Surpasses state-of-the-art by 10% on 3DLoMatch (3D-3D).

Abstract

Visual correspondence across image-to-image (2D-2D), image-to-point cloud (2D-3D), and point cloud-to-point cloud (3D-3D) geometric matching forms the foundation for numerous 3D vision tasks. Despite sharing a similar problem structure, current methods use task-specific designs with separate models for each modality combination. We present UniCorrn, the first correspondence model with shared weights that unifies geometric matching across all three tasks. Our key insight is that Transformer attention naturally captures cross-modal feature similarity. We propose a dual-stream decoder that maintains separate appearance and positional feature streams. This design enables end-to-end learning through stackable layers while supporting flexible query-based correspondence estimation across heterogeneous modalities. Our architecture employs modality-specific backbones followed by shared encoder…
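The idea that attention captures feature similarity can be illustrated with a small sketch. This is not the authors' implementation and not the full dual-stream decoder: it is a single scaled dot-product attention step in NumPy, assuming appearance features drive the attention weights while target coordinates are carried separately and only averaged by those weights. The function name `soft_correspondence` and all parameter names are hypothetical.

```python
import numpy as np

def soft_correspondence(query_feat, target_feat, target_pos, temperature=1.0):
    """Estimate a target position for each query via attention weights.

    query_feat:  (Q, D) appearance features of the query points
    target_feat: (N, D) appearance features of the target tokens
    target_pos:  (N, P) target coordinates (P=2 for pixels, P=3 for points)
    Illustrative assumption only; not taken from the paper.
    """
    d = query_feat.shape[-1]
    # Scaled dot-product similarity between every query and every target token.
    logits = query_feat @ target_feat.T / (np.sqrt(d) * temperature)
    # Numerically stable softmax over the target axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    # Positions stay in their own stream and are only mixed by the weights.
    matched_pos = weights @ target_pos
    return matched_pos, weights
```

Because `target_pos` may have two or three columns, the same function covers a 2D image target and a 3D point cloud target; only the feature dimension `D` has to agree between query and target.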

Code & Models

Repositories

https://neu-vi.github.io/UniCorrn