Single-Frame Point-Pixel Registration via Supervised Cross-Modal Feature Matching

Yu Han; Zhiwei Huang; Yanting Zhang; Fangjun Ding; Shen Cai; Rui Fan

arXiv:2506.22784·cs.CV·July 1, 2025



Open Access

TL;DR

This paper introduces a detector-free, projection-based method that matches LiDAR points directly to camera-image pixels. It is designed to handle the sparsity and noise of single-frame LiDAR data without multi-frame accumulation.

Contribution

The paper proposes a detector-free cross-modal matching framework with a repeatability scoring mechanism, targeting point-pixel registration in sparse single-frame LiDAR scenarios.

Findings

01 Achieves state-of-the-art results on the KITTI, nuScenes, and MIAS-LCEC-TF70 benchmarks.

02 Outperforms prior methods, including those that use accumulated point clouds.

03 Demonstrates robustness under sparse and noisy LiDAR inputs.

Abstract

Point-pixel registration between LiDAR point clouds and camera images is a fundamental yet challenging task in autonomous driving and robotic perception. A key difficulty lies in the modality gap between unstructured point clouds and structured images, especially under sparse single-frame LiDAR settings. Existing methods typically extract features separately from point clouds and images, then rely on hand-crafted or learned matching strategies. This separate encoding fails to bridge the modality gap effectively. More critically, these methods struggle with the sparsity and noise of single-frame LiDAR, and often require point cloud accumulation or additional priors to improve reliability. Inspired by recent progress in detector-free matching paradigms (e.g., MatchAnything), we revisit the projection-based approach and introduce a detector-free framework for direct point-pixel matching…
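
The "projection-based approach" mentioned in the abstract generally means rendering the LiDAR points into the image plane so that both modalities can be compared as 2D images. The sketch below illustrates that general idea only and is not the authors' implementation. It assumes a pinhole camera with intrinsics fx, fy, cx, cy and a candidate rigid transform (rotation, translation) from the LiDAR frame to the camera frame. The function names lidar_to_camera and project_to_depth_map are hypothetical and chosen for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def lidar_to_camera(points: Sequence[Point],
                    rotation: Sequence[Sequence[float]],
                    translation: Sequence[float]) -> List[Point]:
    """Apply the rigid transform p_cam = R * p_lidar + t to every point."""
    return [
        (
            sum(rotation[0][j] * p[j] for j in range(3)) + translation[0],
            sum(rotation[1][j] * p[j] for j in range(3)) + translation[1],
            sum(rotation[2][j] * p[j] for j in range(3)) + translation[2],
        )
        for p in points
    ]


def project_to_depth_map(points_cam: Sequence[Point],
                         fx: float, fy: float, cx: float, cy: float,
                         width: int, height: int) -> List[List[Optional[float]]]:
    """Render camera-frame points into a sparse depth map (pinhole model).

    Pixels that receive no point stay None. When several points fall on
    the same pixel, the nearest one is kept.
    """
    depth: List[List[Optional[float]]] = [[None] * width for _ in range(height)]
    for x, y, z in points_cam:
        if z <= 0.0:
            continue  # point is behind the camera
        u = math.floor(fx * x / z + cx)  # column index
        v = math.floor(fy * y / z + cy)  # row index
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] is None or z < depth[v][u]:
                depth[v][u] = z
    return depth


# Toy example: identity rotation, 1 m offset along the optical axis (assumed values).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
points_lidar = [(0.0, 0.0, 1.0), (0.0, 0.0, 4.0), (1.0, 0.0, 1.0), (0.0, 0.0, -3.0)]
points_cam = lidar_to_camera(points_lidar, identity, (0.0, 0.0, 1.0))
depth_map = project_to_depth_map(points_cam, 100.0, 100.0, 2.0, 2.0, 5, 5)
```

In the toy example only one pixel is filled: two points land on the image centre and the nearer one wins, one point projects outside the 5x5 image, and one lies behind the camera. A single real LiDAR sweep likewise fills only a small fraction of image pixels, which is the sparsity the abstract refers to.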


Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · 3D Shape Modeling and Analysis