DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation
Xiankang He, Guangkai Xu, Bo Zhang, Hao Chen, Ying Cui, Dongyan Guo

TL;DR
This paper introduces DiffCalib, a novel approach that reformulates monocular camera calibration as a dense incident map generation task using diffusion models, achieving state-of-the-art accuracy and benefiting 3D vision applications.
Contribution
DiffCalib leverages pre-trained diffusion models to jointly estimate camera intrinsics and depth maps, yielding a calibration method that is robust, generalizable, and accurate.
Findings
Reduces prediction errors by up to 40%.
Outperforms existing methods on multiple datasets.
Enhances 3D reconstruction from single images.
Abstract
Monocular camera calibration is a key precondition for numerous 3D vision applications. Despite considerable advancements, existing methods often hinge on specific assumptions and struggle to generalize across varied real-world scenarios, and their performance is limited by insufficient training data. Recently, diffusion models trained on expansive datasets have been shown to generate diverse, high-quality images. This success suggests that such models have strong potential to understand varied visual information. In this work, we leverage the comprehensive visual knowledge embedded in pre-trained diffusion models to enable more robust and accurate monocular camera intrinsic estimation. Specifically, we reformulate the problem of estimating the four degrees of freedom (4-DoF) of camera intrinsic parameters as a dense incident map generation task. The…
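To make the reformulation concrete, the sketch below shows how four intrinsic parameters can be turned into a dense per-pixel map and recovered from it. It rests on assumptions not spelled out in the text above: the incident map is taken to be the per-pixel ray direction K^{-1}[u, v, 1]^T of a pinhole camera with focal lengths fx, fy and principal point cx, cy, and the intrinsics are recovered with a plain least-squares line fit. The function names are hypothetical, and this is not the authors' diffusion model or their recovery procedure, only an illustration of why a dense map can encode 4-DoF intrinsics.

```python
def incident_map(fx, fy, cx, cy, width, height):
    """Assumed definition: per-pixel ray K^{-1} [u, v, 1]^T, as rows of (x, y, 1)."""
    return [
        [((u - cx) / fx, (v - cy) / fy, 1.0) for u in range(width)]
        for v in range(height)
    ]


def fit_line(xs, ys):
    """Ordinary least squares for ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def intrinsics_from_incident_map(rays):
    """Recover (fx, fy, cx, cy) from a dense incident map.

    Since x = (u - cx) / fx, the pixel column satisfies u = fx * x + cx,
    so fx is the slope and cx the intercept of a line fit (same for v, y).
    """
    us, vs, xs, ys = [], [], [], []
    for v, row in enumerate(rays):
        for u, (x, y, z) in enumerate(row):
            us.append(float(u))
            vs.append(float(v))
            xs.append(x / z)  # normalize in case the ray is not scaled to z = 1
            ys.append(y / z)
    fx, cx = fit_line(xs, us)
    fy, cy = fit_line(ys, vs)
    return fx, fy, cx, cy


if __name__ == "__main__":
    rays = incident_map(500.0, 480.0, 31.5, 23.5, width=64, height=48)
    print(intrinsics_from_incident_map(rays))
```

With a noise-free map the fit returns the original parameters up to floating-point error. A predicted map would be noisy, so a robust estimator would likely be a better fit than plain least squares in practice.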
Taxonomy
Topics: Advanced Vision and Imaging · Image and Object Detection Techniques · Satellite Image Processing and Photogrammetry
Methods: Diffusion
