RePer-360: Releasing Perspective Priors for 360° Depth Estimation via Self-Modulation
Cheng Guan, Chunyu Lin, Zhijie Shen, Junsong Zhang, Jiyuan Wang

TL;DR
RePer-360 is a distortion-aware self-modulation framework that adapts perspective-trained depth foundation models to 360° panoramic depth estimation, outperforming standard fine-tuning while using only 1% of the training data.
Contribution
The paper proposes a lightweight geometry-aligned guidance module and a Self-Conditioned AdaLN-Zero mechanism that adapt pretrained models to panoramic images without overwriting their perspective priors.
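As a rough illustration of the AdaLN-Zero idea named above, here is a minimal NumPy sketch of generic adaptive layer normalization with zero-initialized modulation, in the form commonly used in diffusion transformers. It is not the authors' implementation: the class name `AdaLNZero`, the per-pixel conditioning tensor `cond`, and the `branch` callable are hypothetical, and this summary does not specify how RePer-360 derives its conditioning signal.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize over the channel (last) axis, without learned affine terms."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


class AdaLNZero:
    """Generic AdaLN-Zero block (hypothetical names, illustrative only)."""

    def __init__(self, cond_dim, feat_dim):
        # Zero initialization: shift, scale, and gate all start at zero,
        # so the block is an identity mapping before any training.
        self.W = np.zeros((cond_dim, 3 * feat_dim))
        self.b = np.zeros(3 * feat_dim)

    def __call__(self, x, cond, branch):
        # x:    (H, W, C) feature map from the pretrained backbone
        # cond: (H, W, D) per-pixel conditioning signal (assumed shape)
        shift, scale, gate = np.split(cond @ self.W + self.b, 3, axis=-1)
        h = layer_norm(x) * (1.0 + scale) + shift
        # The gated residual keeps x untouched while gate == 0.
        return x + gate * branch(h)
```

With the zero initialization, the output equals the input feature map until training moves the modulation weights away from zero, which is one plausible way for an adapter to leave pretrained behavior intact at the start of adaptation.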
Findings
Surpasses standard fine-tuning while using only 1% of the training data.
Reduces RMSE by approximately 20% in in-domain settings.
Enhances training stability and cross-projection alignment with a cubemap-domain consistency loss.
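For readers unfamiliar with the metric, the sketch below shows how depth RMSE and a relative improvement figure are typically computed. The function names and the validity mask (`gt > 0`) are assumptions for illustration, and the numbers in the usage lines are made up, not results from the paper.

```python
import numpy as np


def rmse(pred, gt, valid=None):
    """Root mean squared error over valid depth pixels."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if valid is None:
        # Assumption: non-positive ground-truth depth marks missing pixels.
        valid = gt > 0
    diff = pred[valid] - gt[valid]
    return float(np.sqrt(np.mean(diff ** 2)))


def relative_improvement(baseline_error, new_error):
    """Fractional error reduction relative to a baseline (0.2 means 20%)."""
    return (baseline_error - new_error) / baseline_error


# Illustrative values only: an error drop from 0.50 to 0.40 is a 20% reduction.
example_gain = relative_improvement(0.50, 0.40)
```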
Abstract
Recent depth foundation models trained on perspective imagery achieve strong performance, yet generalize poorly to 360° images due to the substantial geometric discrepancy between perspective and panoramic domains. Moreover, fully fine-tuning these models typically requires large amounts of panoramic data. To address this issue, we propose RePer-360, a distortion-aware self-modulation framework for monocular panoramic depth estimation that adapts depth foundation models while preserving powerful pretrained perspective priors. Specifically, we design a lightweight geometry-aligned guidance module to derive a modulation signal from two complementary projections (i.e., ERP and CP) and use it to guide the model toward the panoramic domain without overwriting its pretrained perspective knowledge. We further introduce a Self-Conditioned AdaLN-Zero mechanism that produces pixel-wise…
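To make the relation between the two projections concrete, the sketch below maps the pixels of one cubemap (CP) face to coordinates in an equirectangular (ERP) image. It is a generic geometric illustration, not the paper's guidance module: the function name, the choice of the front face, and the coordinate conventions (longitude zero at the ERP image center, image y axis pointing down) are all assumptions.

```python
import numpy as np


def front_face_to_erp_coords(face_size, erp_h, erp_w):
    """Return ERP (u, v) pixel coordinates seen by each front-face pixel."""
    # Pixel centers on the face plane z = 1, spanning (-1, 1) in x and y.
    t = (np.arange(face_size) + 0.5) / face_size * 2.0 - 1.0
    x, y = np.meshgrid(t, t)  # x grows to the right, y grows downward
    z = np.ones_like(x)

    # Viewing direction to spherical angles.
    lon = np.arctan2(x, z)                # within (-pi/4, pi/4) on this face
    lat = np.arctan2(-y, np.hypot(x, z))  # positive toward the top

    # Spherical angles to ERP pixel coordinates.
    u = (lon / (2.0 * np.pi) + 0.5) * erp_w
    v = (0.5 - lat / np.pi) * erp_h
    return u, v
```

Sampling the ERP image at the returned `(u, v)` positions yields a 90° perspective view with little distortion, which is why a cubemap face resembles the imagery that perspective-trained models expect.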
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis
