PanDA: Towards Panoramic Depth Anything with Unlabeled Panoramas and Mobius Spatial Augmentation
Zidong Cao, Jinjing Zhu, Weiming Zhang, Hao Ai, Haotian Bai, Hengshuang Zhao, Lin Wang

TL;DR
This paper introduces PanDA, a panoramic depth estimation model trained via semi-supervised learning with Möbius transformations, demonstrating strong zero-shot performance and robustness to distortions in panoramic images.
Contribution
It proposes a novel semi-supervised framework with Möbius spatial augmentation for panoramic depth estimation, addressing the limitations of existing depth models on panoramic images.
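To make the idea of Möbius spatial augmentation concrete, the following is a minimal sketch of how a Möbius transformation can warp an equirectangular panorama. It is not taken from the paper: the function name `mobius_warp_equirect`, the parameters `a`, `b`, `c`, `d`, the stereographic projection convention, and the nearest-neighbour sampling are all assumptions chosen for illustration, and PanDA's actual parameterisation may differ.

```python
import numpy as np

def mobius_warp_equirect(img, a, b, c, d):
    """Warp an equirectangular image with f(z) = (a z + b) / (c z + d).

    Hypothetical helper for illustration; uses nearest-neighbour sampling.
    """
    if abs(a * d - b * c) < 1e-12:
        raise ValueError("ad - bc must be non-zero for a Mobius transformation")
    h, w = img.shape[:2]

    # Longitude/latitude of every output pixel centre.
    lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi
    lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)

    # Points on the unit sphere.
    x = np.cos(lat) * np.cos(lon)
    y = np.cos(lat) * np.sin(lon)
    z = np.sin(lat)

    with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
        # Stereographic projection from the north pole onto the complex plane.
        zeta = (x + 1j * y) / (1 - z)
        # Inverse Mobius map: where each output pixel is sampled from.
        src = (d * zeta - b) / (-c * zeta + a)
        # Inverse stereographic projection back onto the sphere.
        r2 = np.abs(src) ** 2
        sx = 2 * src.real / (r2 + 1)
        sy = 2 * src.imag / (r2 + 1)
        sz = (r2 - 1) / (r2 + 1)

    # Points sent to infinity correspond to the north pole.
    bad = ~(np.isfinite(sx) & np.isfinite(sy) & np.isfinite(sz))
    sx = np.where(bad, 0.0, sx)
    sy = np.where(bad, 0.0, sy)
    sz = np.where(bad, 1.0, sz)

    # Back to pixel indices in the source image.
    src_lon = np.arctan2(sy, sx)
    src_lat = np.arcsin(np.clip(sz, -1.0, 1.0))
    col = np.floor((src_lon + np.pi) / (2 * np.pi) * w).astype(int) % w
    row = np.clip(np.floor((np.pi / 2 - src_lat) / np.pi * h).astype(int), 0, h - 1)
    return img[row, col]
```

Under these conventions, `a = np.exp(1j * theta)` with `b = c = 0` and `d = 1` rotates the panorama about the vertical axis, while a real `a` other than 1 stretches content toward one pole. The same warp can be applied to any per-pixel map aligned with the image so that the two stay in correspondence.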
Findings
PanDA outperforms existing methods on real-world benchmarks.
PanDA exhibits strong zero-shot generalization across diverse scenes.
Möbius transformation-based augmentation improves robustness to distortions.
Abstract
Recently, Depth Anything Models (DAMs) - a type of depth foundation model - have demonstrated impressive zero-shot capabilities across diverse perspective images. Despite their success, it remains an open question how DAMs perform on panoramic images, which enjoy a large field of view (180°×360°) but suffer from spherical distortions. To address this gap, we conduct an empirical analysis to evaluate the performance of DAMs on panoramic images and identify their limitations. For this, we undertake comprehensive experiments to assess the performance of DAMs with respect to three key factors: panoramic representations, 360° camera positions for capturing scenarios, and spherical spatial transformations. In this way, we reveal some key findings, e.g., DAMs are sensitive to spatial transformations. We then propose a semi-supervised learning (SSL) framework to learn a panoramic DAM, dubbed PanDA. Under…
Taxonomy
Topics: Image Processing and 3D Reconstruction
