Native-Domain Cross-Attention for Camera-LiDAR Extrinsic Calibration Under Large Initial Perturbations

Ni Ou; Zhuo Chen; Xinru Zhang; Junzheng Wang

arXiv:2603.29414 · cs.CV · April 1, 2026

TL;DR

This paper introduces a cross-attention framework for camera-LiDAR extrinsic calibration that aligns image and LiDAR features directly in their native domains, improving robustness under large initial misalignments.

Contribution

The paper proposes an extrinsic-aware cross-attention mechanism that injects extrinsic parameter hypotheses into correspondence modeling, so cross-modal correspondences are established without projecting LiDAR points into 2D depth maps, enhancing calibration accuracy and robustness.

Findings

01 Outperforms state-of-the-art methods on the KITTI and nuScenes benchmarks.

02 Achieves 88% accurate calibration under large perturbations on KITTI.

03 Achieves 99% accuracy under large perturbations on nuScenes.

Abstract

Accurate camera-LiDAR fusion relies on precise extrinsic calibration, which fundamentally depends on establishing reliable cross-modal correspondences under potentially large misalignments. Existing learning-based methods typically project LiDAR points into depth maps for feature fusion, which distorts 3D geometry and degrades performance when the extrinsic initialization is far from the ground truth. To address this issue, we propose an extrinsic-aware cross-attention framework that directly aligns image patches and LiDAR point groups in their native domains. The proposed attention mechanism explicitly injects extrinsic parameter hypotheses into the correspondence modeling process, enabling geometry-consistent cross-modal interaction without relying on projected 2D depth maps. Extensive experiments on the KITTI and nuScenes benchmarks demonstrate that our method consistently…
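
The abstract describes injecting extrinsic hypotheses into the attention between image patches and LiDAR point groups, but the formulation itself is not given on this page. As a rough illustration only, the NumPy sketch below assumes one plausible way to do it: each point-group centroid is projected into the image under the current extrinsic hypothesis, and the pixel distance to each patch center is added as a bias to ordinary scaled dot-product attention. The function names, the Gaussian distance bias, and the `sigma` parameter are assumptions made for this example; this is not the paper's or the ProjFusion repository's implementation.

```python
import numpy as np


def project_points(points_lidar, T_cam_lidar, K):
    """Project (N, 3) LiDAR points to pixels under one extrinsic hypothesis.

    T_cam_lidar: 4x4 LiDAR-to-camera transform (the hypothesis).
    K: 3x3 pinhole intrinsic matrix.
    Returns (N, 2) pixel coordinates and a mask of points in front of the camera.
    """
    ones = np.ones((points_lidar.shape[0], 1))
    pts_cam = (np.hstack([points_lidar, ones]) @ T_cam_lidar.T)[:, :3]
    in_front = pts_cam[:, 2] > 1e-6
    # Masked points get a placeholder depth of 1 to avoid dividing by zero.
    z = np.where(in_front, pts_cam[:, 2], 1.0)
    uvw = pts_cam @ K.T
    return uvw[:, :2] / z[:, None], in_front


def extrinsic_aware_attention(patch_feats, patch_centers, group_feats,
                              group_centers, T_cam_lidar, K, sigma=5.0):
    """Image patches (queries) attend to LiDAR point groups (keys and values).

    Hypothetical bias: a Gaussian penalty on the pixel distance between each
    patch center and each group centroid projected with T_cam_lidar.
    Returns the attended features (P, D) and the attention weights (P, G).
    """
    d = patch_feats.shape[1]
    logits = patch_feats @ group_feats.T / np.sqrt(d)            # (P, G)
    uv, in_front = project_points(group_centers, T_cam_lidar, K)
    diff = patch_centers[:, None, :] - uv[None, :, :]            # (P, G, 2)
    bias = -(diff ** 2).sum(axis=-1) / (2.0 * sigma ** 2)
    # Groups behind the camera are effectively excluded from attention.
    bias = np.where(in_front[None, :], bias, -1e9)
    scores = logits + bias
    scores = scores - scores.max(axis=1, keepdims=True)          # stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ group_feats, weights
```

Calling `extrinsic_aware_attention` with two different `T_cam_lidar` hypotheses on the same features shows how the hypothesis alone can shift which point group a patch attends to. Note that the point features themselves stay in 3D in this sketch, and only group centroids are projected to form the bias.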

Code & Models

Repositories

gitouni/ProjFusion (GitHub)
