Single-Frame Point-Pixel Registration via Supervised Cross-Modal Feature Matching
Yu Han, Zhiwei Huang, Yanting Zhang, Fangjun Ding, Shen Cai, Rui Fan

TL;DR
This paper introduces a novel detector-free, projection-based method for direct point-pixel registration between LiDAR point clouds and camera images, effectively handling sparsity and noise in single-frame data without multi-frame accumulation.
Contribution
It proposes a detector-free cross-modal matching framework with a repeatability scoring mechanism, advancing point-pixel registration in sparse single-frame LiDAR scenarios.
Findings
Achieves state-of-the-art results on KITTI, nuScenes, and MIAS-LCEC-TF70 benchmarks.
Outperforms prior methods, including those that use accumulated point clouds.
Demonstrates robustness under sparse and noisy LiDAR inputs.
Abstract
Point-pixel registration between LiDAR point clouds and camera images is a fundamental yet challenging task in autonomous driving and robotic perception. A key difficulty lies in the modality gap between unstructured point clouds and structured images, especially under sparse single-frame LiDAR settings. Existing methods typically extract features separately from point clouds and images, then rely on hand-crafted or learned matching strategies. This separate encoding fails to bridge the modality gap effectively. More critically, these methods struggle with the sparsity and noise of single-frame LiDAR, and often require point cloud accumulation or additional priors to improve reliability. Inspired by recent progress in detector-free matching paradigms (e.g., MatchAnything), we revisit the projection-based approach and introduce a detector-free framework for direct point-pixel matching…
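The abstract refers to a projection-based approach, in which LiDAR points are rendered into the image plane so that matching can be done between two 2D inputs. The summary does not describe the authors' projection step, so the following is only a minimal sketch of a generic pinhole projection, not the paper's implementation. The function name `project_points`, its parameters, and the choice to keep the nearest depth per pixel are all assumptions made for illustration. It also assumes an intrinsic matrix `K` and a rough LiDAR-to-camera transform `T_cam_lidar` are available.

```python
import numpy as np

def project_points(points_lidar, K, T_cam_lidar, image_size):
    """Render N x 3 LiDAR points into a sparse depth image (hypothetical helper).

    points_lidar: (N, 3) points in the LiDAR frame.
    K:            (3, 3) pinhole intrinsic matrix.
    T_cam_lidar:  (4, 4) assumed LiDAR-to-camera rigid transform.
    image_size:   (height, width) of the target image.
    """
    h, w = image_size

    # Move points into the camera frame using homogeneous coordinates.
    ones = np.ones((len(points_lidar), 1))
    pts_cam = (T_cam_lidar @ np.hstack([points_lidar, ones]).T).T[:, :3]

    # Discard points behind (or almost on) the image plane.
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]

    # Pinhole projection: apply K, then divide by depth.
    uvw = (K @ pts_cam.T).T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)

    # Keep only points that land inside the image bounds.
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    u, v, z = u[inside], v[inside], pts_cam[inside, 2]

    # Write far points first so the nearest point wins at shared pixels
    # (NumPy keeps the last value assigned to a repeated index).
    order = np.argsort(-z)
    depth = np.zeros((h, w), dtype=np.float32)
    depth[v[order], u[order]] = z[order]
    return depth
```

With a single LiDAR sweep, most pixels of the resulting image stay at zero, which illustrates the sparsity that the abstract identifies as the main obstacle for single-frame registration.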
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · 3D Shape Modeling and Analysis
