Scalable and Generalizable Correspondence Pruning via Geometry-Consistent Pre-training

Tangfei Liao, Xiaoqin Zhang, Tao Wang, Hao Ye, Min Li, Guobao Xiao, and Mang Ye

arXiv:2406.05773 · cs.CV · April 7, 2026

TL;DR

This paper introduces a geometry-consistent pre-training approach for correspondence pruning that enhances robustness and generalization in 3D vision tasks, outperforming existing methods.

Contribution

The authors propose a novel pre-training paradigm with masked inlier reconstruction and a unified encoder, improving correspondence filtering in 3D vision applications.

Findings

01. Achieves over 10% performance gains in camera pose estimation.

02. Outperforms state-of-the-art methods in visual localization.

03. Enhances robustness and generalization across tasks.

Abstract

Two-view correspondence pruning aims to identify reliable correspondences for camera pose estimation, serving as a fundamental step in many 3D vision tasks. Existing methods rely on geometric consistency to seek true correspondences (inliers) from numerous false correspondences (outliers). In this learning paradigm, outliers severely affect the representation learning of inliers, resulting in models that are neither robust nor generalizable. To address this issue, we propose a geometry-consistent pre-training paradigm that sculpts scalable and generalizable representations free from outlier interference. The paradigm features two appealing properties. 1) Implementation of geometry-consistent pre-training. We introduce masked inlier reconstruction as a pretext task and develop a simple yet effective pre-training framework based on a masked autoencoder. Specifically, due to the irregular…
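To make the masked inlier reconstruction pretext task more concrete, the following is a minimal sketch of the general masked-autoencoder recipe applied to a set of inlier correspondences: hide a random subset, predict it from the visible subset, and score the prediction only on the hidden part. It is not the authors' implementation. The function names `mask_inliers` and `reconstruction_loss`, the `mask_ratio` and `seed` parameters, the 4-tuple `(x1, y1, x2, y2)` layout, and the mean-of-visible "decoder" are all assumptions made for illustration; the abstract does not specify them.

```python
import random


def mask_inliers(num_inliers, mask_ratio=0.5, seed=0):
    """Randomly split inlier indices into visible and masked subsets.

    mask_ratio and seed are hypothetical parameters chosen for this
    sketch; they are not taken from the paper.
    """
    rng = random.Random(seed)
    indices = list(range(num_inliers))
    rng.shuffle(indices)
    num_masked = int(round(mask_ratio * num_inliers))
    masked_idx = sorted(indices[:num_masked])
    visible_idx = sorted(indices[num_masked:])
    return visible_idx, masked_idx


def reconstruction_loss(predicted, target):
    """Mean squared error per correspondence, over masked items only."""
    if not target:
        return 0.0
    total = 0.0
    for p, t in zip(predicted, target):
        total += sum((pi - ti) ** 2 for pi, ti in zip(p, t))
    return total / len(target)


# Toy inlier set: each correspondence is (x1, y1, x2, y2).
inliers = [(0.1 * i, 0.2 * i, 0.1 * i + 0.05, 0.2 * i) for i in range(8)]

visible_idx, masked_idx = mask_inliers(len(inliers), mask_ratio=0.5)
targets = [inliers[i] for i in masked_idx]

# Placeholder "decoder" (an assumption): predict every masked
# correspondence as the mean of the visible ones. A real model would
# replace this with a learned encoder-decoder.
mean_visible = tuple(
    sum(inliers[i][k] for i in visible_idx) / len(visible_idx)
    for k in range(4)
)
predictions = [mean_visible for _ in masked_idx]

loss = reconstruction_loss(predictions, targets)
```

In this sketch the loss is computed only on the masked correspondences, which mirrors the usual masked-autoencoder design choice: the model cannot lower the loss by copying its input and must instead infer the hidden inliers from the geometric structure of the visible ones.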
