Learning Dense Feature Matching via Lifting Single 2D Image to 3D Space

Yingping Liang; Yutao Hu; Wenqi Shao; Ying Fu

arXiv:2507.00392·cs.CV·July 8, 2025

Open Access

TL;DR

This paper introduces a two-stage framework called Lift to Match (L2M) that enhances feature matching by lifting 2D images into 3D space, improving generalization across diverse scenarios using large-scale synthetic data.

Contribution

The paper presents a novel 3D-aware feature encoder and a view rendering strategy that together enable robust, domain-generalizable feature matching from single-view images.

Findings

01. Outperforms existing methods on zero-shot benchmarks.

02. Achieves superior generalization across diverse domains.

03. Effectively leverages synthetic data for training.

Abstract

Feature matching plays a fundamental role in many computer vision tasks, yet existing methods rely heavily on scarce, clean multi-view image collections, which constrains their generalization to diverse and challenging scenarios. Moreover, conventional feature encoders are typically trained on single-view 2D images, limiting their capacity to capture 3D-aware correspondences. In this paper, we propose a novel two-stage framework, Lift to Match (L2M), that lifts 2D images to 3D space and takes full advantage of large-scale and diverse single-view images. Specifically, in the first stage, we learn a 3D-aware feature encoder using a combination of multi-view image synthesis and 3D feature Gaussian representation, which injects 3D geometry knowledge into the encoder. In the second stage, a novel-view rendering strategy, combined with large-scale synthetic data generation…
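To make the idea of lifting a single image to 3D more concrete, here is a minimal sketch of how dense correspondences can be derived from one view once per-pixel depth is available. It is not the authors' implementation: it assumes a plain pinhole camera, a known depth map, and a hand-picked virtual camera pose, and every name in it (lift_pixel, project_point, dense_correspondences, the intrinsics tuple) is hypothetical. The abstract does not say how L2M obtains depth or renders views, so this only illustrates the underlying geometry.

```python
def lift_pixel(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with known depth to a 3D point (pinhole model, assumed)."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def transform(point, R, t):
    """Move a 3D point into the virtual (novel-view) camera frame: R @ point + t."""
    return tuple(sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))


def project_point(point, fx, fy, cx, cy):
    """Project a 3D point to pixel coordinates; return None if it lies behind the camera."""
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def dense_correspondences(depth_map, intrinsics, R, t):
    """Map each source pixel to its location in the novel view, keeping in-bounds pixels only."""
    fx, fy, cx, cy = intrinsics
    h, w = len(depth_map), len(depth_map[0])
    matches = {}
    for v in range(h):
        for u in range(w):
            p = lift_pixel(u, v, depth_map[v][u], fx, fy, cx, cy)
            q = project_point(transform(p, R, t), fx, fy, cx, cy)
            if q is not None and 0 <= q[0] <= w - 1 and 0 <= q[1] <= h - 1:
                matches[(u, v)] = q
    return matches


# Toy example (all values are illustrative assumptions): a flat scene 2 units
# away, viewed again after a pure sideways camera translation.
depth_map = [[2.0] * 64 for _ in range(48)]
intrinsics = (100.0, 100.0, 32.0, 24.0)  # fx, fy, cx, cy
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (0.2, 0.0, 0.0)
matches = dense_correspondences(depth_map, intrinsics, R, t)
```

In this toy setup every surviving pixel shifts horizontally by fx * t_x / depth = 10 pixels, and pixels that would land outside the image are dropped. Correspondences produced this way come for free from a single image plus geometry, which is the general reason single-view data can supervise a matcher; the actual rendering and data-generation pipeline in the paper may differ substantially.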

Taxonomy

Topics: Face and Expression Recognition · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging