LeCoT: revisiting network architecture for two-view correspondence pruning
Luanyuan Dai, Xiaoyu Du, Jinhui Tang

TL;DR
LeCoT introduces a novel transformer-based network for two-view correspondence pruning that effectively captures global context information without extra modules, outperforming existing methods across multiple vision tasks.
Contribution
We propose LeCoT, a new network architecture with a Spatial-Channel Fusion Transformer and a progressive prediction block for improved correspondence pruning.
Findings
LeCoT outperforms state-of-the-art methods in correspondence pruning.
LeCoT improves accuracy in relative pose and homography estimation.
LeCoT enhances visual localization and 3D reconstruction results.
Abstract
Two-view correspondence pruning aims to accurately remove incorrect correspondences (outliers) from an initial set and is widely applied to various computer vision tasks. Current popular strategies adopt a multilayer perceptron (MLP) as the backbone, supplemented by additional modules that enhance the network's ability to handle context information, a known limitation of MLPs. In contrast, we introduce a novel perspective for capturing correspondence context information without extra design modules. To this end, we design a two-view correspondence pruning network called LeCoT, which can naturally leverage global context information at different stages. Specifically, the core design of LeCoT is the Spatial-Channel Fusion Transformer block, a newly proposed component that efficiently utilizes both spatial and channel global context information among sparse correspondences. In addition,…
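To make the idea of "spatial and channel global context" more concrete, the following is a minimal NumPy sketch of attention applied along the two axes of a correspondence feature matrix. It is not the authors' Spatial-Channel Fusion Transformer: the function names, the random linear embedding, the feature width of 32, and the additive fusion are all assumptions chosen for illustration. Spatial attention here mixes information across the N correspondences, while channel attention mixes information across the C feature channels.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_attention(feats):
    """Attend across correspondences: (N, C) -> (N, C) via an (N, N) map."""
    _, c = feats.shape
    weights = softmax(feats @ feats.T / np.sqrt(c), axis=-1)  # (N, N)
    return weights @ feats


def channel_attention(feats):
    """Attend across channels: (N, C) -> (N, C) via a (C, C) map."""
    n, _ = feats.shape
    weights = softmax(feats.T @ feats / np.sqrt(n), axis=-1)  # (C, C)
    return feats @ weights.T


def fuse(feats):
    """Hypothetical fusion: residual sum of both context branches."""
    return feats + spatial_attention(feats) + channel_attention(feats)


# Toy input: 500 putative correspondences (x1, y1, x2, y2) in normalized
# image coordinates, embedded into 32 channels by a random linear map.
rng = np.random.default_rng(0)
correspondences = rng.uniform(-1.0, 1.0, size=(500, 4))
embedding = rng.normal(size=(4, 32)) * 0.1  # stand-in for a learned layer
feats = correspondences @ embedding
fused = fuse(feats)
```

In this sketch the spatial branch costs O(N²C) and the channel branch O(NC²), so the channel branch stays cheap when the number of correspondences is much larger than the feature width. A real pruning network would follow such blocks with a per-correspondence inlier score, which is omitted here.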
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
