SphereFusion: Efficient Panorama Depth Estimation via Gated Fusion
Qingsong Yan, Qiang Wang, Kaiyong Zhao, Jie Chen, Bo Li, Xiaowen Chu, Fei Deng

TL;DR
SphereFusion is an end-to-end framework that combines multiple projection methods and a gated fusion module to efficiently estimate panorama depth with high accuracy and speed, overcoming challenges of distortion and texture loss.
Contribution
It introduces SphereFusion, a novel method that fuses features from different projections for improved panorama depth estimation, achieving state-of-the-art speed and competitive accuracy.
Findings
Achieves the fastest inference time, 17 ms on 512×1024 images.
Demonstrates competitive depth estimation accuracy on three public datasets.
Effectively handles distortion and texture loss issues in panorama images.
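The distortion mentioned above has a simple geometric source: every row of an equirectangular image has the same number of pixels, but rows near the poles cover much shorter circles on the sphere, so content there is stretched horizontally by roughly 1/cos(latitude). The following is a minimal sketch of that mapping, not code from the paper. The function names, the pixel-center convention, and the axis layout are assumptions made for illustration.

```python
import math

def pixel_to_sphere(row, col, height=512, width=1024):
    """Map an equirectangular pixel center to a point on the unit sphere.

    Assumed convention (not from the paper): row 0 is the top of the image
    (north pole side), and longitude runs from -pi to pi left to right.
    """
    lat = math.pi / 2 - (row + 0.5) / height * math.pi
    lon = (col + 0.5) / width * 2 * math.pi - math.pi
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    return (x, y, z)

def horizontal_stretch(row, height=512):
    """Horizontal stretch factor of a row relative to the equator."""
    lat = math.pi / 2 - (row + 0.5) / height * math.pi
    return 1.0 / math.cos(lat)

if __name__ == "__main__":
    # Compare a row next to the pole with a row next to the equator.
    print(horizontal_stretch(0), horizontal_stretch(255))
```

Rows close to the equator have a stretch factor near 1, while the top and bottom rows are stretched by a factor in the hundreds, which is why a plain 2D convolution sees very different geometry at different image heights.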
Abstract
Due to the rapid development of panorama cameras, the task of estimating panorama depth has attracted significant attention from the computer vision community, especially in applications such as robot sensing and autonomous driving. However, existing methods relying on different projection formats often encounter challenges, either struggling with distortion and discontinuity in the case of equirectangular, cubemap, and tangent projections, or experiencing a loss of texture details with the spherical projection. To tackle these concerns, we present SphereFusion, an end-to-end framework that combines the strengths of various projection methods. Specifically, SphereFusion initially employs 2D image convolution and mesh operations to extract two distinct types of features from the panorama image in both equirectangular and spherical projection domains. These features are then projected…
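The abstract is cut off before it describes the fusion step, so the details of the paper's gated fusion module are not available here. As a rough illustration of what gated fusion generally means, the sketch below blends two aligned feature vectors with a per-element sigmoid gate. The function name `gated_fusion`, the scalar weights `w_equi`, `w_sphere`, and `bias`, and the use of plain Python lists are all hypothetical; in a trained network the gate would come from learned layers.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(feat_equi, feat_sphere, w_equi=1.0, w_sphere=-1.0, bias=0.0):
    """Blend two aligned feature vectors element by element.

    Generic gating form (an assumption, not the paper's module):
        gate  = sigmoid(w_equi * e + w_sphere * s + bias)
        fused = gate * e + (1 - gate) * s
    """
    if len(feat_equi) != len(feat_sphere):
        raise ValueError("features must be aligned to the same length")
    fused = []
    for e, s in zip(feat_equi, feat_sphere):
        gate = sigmoid(w_equi * e + w_sphere * s + bias)
        fused.append(gate * e + (1.0 - gate) * s)
    return fused

if __name__ == "__main__":
    # Hypothetical feature values standing in for the two projection branches.
    print(gated_fusion([0.2, 1.5, -0.3], [0.4, 0.1, 0.9]))
```

Because the gate lies strictly between 0 and 1, each fused value is a convex combination of the two inputs, so the module can lean on whichever branch is more reliable at a given location without leaving the range spanned by the two features.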
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Optical measurement and interference techniques
Methods: Softmax · Attention Is All You Need · Convolution · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
