Extending Foundational Monocular Depth Estimators to Fisheye Cameras with Calibration Tokens
Suchisrit Gangopadhyay, Jung-Hee Kim, Xien Chen, Patrick Rim, Hyoungseob Park, Alex Wong

TL;DR
This paper introduces Calibration Tokens, a lightweight mechanism that adapts monocular depth estimators trained on perspective images to fisheye cameras without retraining, by aligning the latent embeddings of fisheye images with those of perspective images.
Contribution
The authors propose Calibration Tokens, a lightweight, self-supervised method that adapts foundational monocular depth estimators (FMDEs) to fisheye images by aligning latent embeddings, avoiding retraining or fine-tuning.
Findings
Consistently improves depth estimation accuracy on indoor and outdoor datasets.
Requires only large-scale perspective datasets and no fisheye images for training.
Outperforms state-of-the-art methods with a unified token approach.
Abstract
We propose a method to extend foundational monocular depth estimators (FMDEs), trained on perspective images, to fisheye images. Despite being trained on tens of millions of images, FMDEs are susceptible to the covariate shift introduced by changes in camera calibration (intrinsic, distortion) parameters, leading to erroneous depth estimates. Our method aligns the distribution of latent embeddings encoding fisheye images to those of perspective images, enabling the reuse of FMDEs for fisheye cameras without retraining or finetuning. To this end, we introduce a set of Calibration Tokens as a light-weight adaptation mechanism that modulates the latent embeddings for alignment. By exploiting the already expressive latent space of FMDEs, we posit that modulating their embeddings avoids the negative impact of artifacts and loss introduced in conventional recalibration or map projection to a…
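The alignment idea in the abstract can be illustrated with a toy forward pass. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the tokens are prepended to the patch-token sequence of a frozen attention layer and that alignment is measured with a mean-squared error, neither of which the abstract specifies. All names (`FrozenAttentionLayer`, `encode`, `alignment_loss`) are hypothetical, and the "fisheye" input is simulated by adding noise to the perspective input rather than by a real camera model.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class FrozenAttentionLayer:
    """Toy stand-in for one frozen transformer layer of an FMDE encoder (assumption)."""
    def __init__(self, dim, rng):
        scale = 1.0 / np.sqrt(dim)
        self.wq = rng.normal(0.0, scale, (dim, dim))
        self.wk = rng.normal(0.0, scale, (dim, dim))
        self.wv = rng.normal(0.0, scale, (dim, dim))

    def __call__(self, tokens):
        q, k, v = tokens @ self.wq, tokens @ self.wk, tokens @ self.wv
        attn = softmax(q @ k.T / np.sqrt(tokens.shape[-1]))
        return tokens + attn @ v  # residual self-attention

def encode(layer, patch_tokens, calibration_tokens=None):
    """Run the frozen layer, optionally with extra calibration tokens in the sequence."""
    if calibration_tokens is None:
        return layer(patch_tokens)
    seq = np.concatenate([calibration_tokens, patch_tokens], axis=0)
    out = layer(seq)
    # Keep only the patch embeddings; the tokens act through attention.
    return out[calibration_tokens.shape[0]:]

def alignment_loss(fisheye_emb, perspective_emb):
    """Mean-squared distance between the two sets of embeddings (assumed loss)."""
    return float(np.mean((fisheye_emb - perspective_emb) ** 2))

rng = np.random.default_rng(0)
dim, n_patches, n_calib = 16, 10, 4
layer = FrozenAttentionLayer(dim, rng)  # weights are never updated

perspective = rng.normal(size=(n_patches, dim))
# Simulated covariate shift; a real pipeline would use actual fisheye inputs.
fisheye = perspective + 0.5 * rng.normal(size=(n_patches, dim))

calib = np.zeros((n_calib, dim))  # the only parameters that would be trained
target = encode(layer, perspective)
adapted = encode(layer, fisheye, calib)
loss = alignment_loss(adapted, target)
```

In a real setting, `loss` would be minimised with respect to `calib` alone using an automatic-differentiation framework, leaving the encoder weights untouched; the sketch stops at the forward pass and loss so that it stays dependency-light.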
Taxonomy
Topics: Remote Sensing and LiDAR Applications · Satellite Image Processing and Photogrammetry · Advanced Vision and Imaging
