Unified Spherical Frontend: Learning Rotation-Equivariant Representations of Spherical Images from Any Camera
Mukai Yu, Mosam Dabhi, Liuyue Xie, Sebastian Scherer, László A. Jeni

TL;DR
The paper introduces USF, a flexible, rotation-equivariant spherical image processing framework that works across various camera types without harmonic transforms, improving robustness and generalization in perception tasks.
Contribution
USF provides a modular, distortion-free, lens-agnostic approach for spherical image processing that achieves rotation-equivariance without harmonic transforms, enabling efficient high-resolution analysis.
Findings
USF loses less than 1% in performance under random rotations.
USF scales efficiently to high-resolution spherical imagery.
USF enables zero-shot generalization to unseen wide-FoV lenses.
Abstract
Modern perception increasingly relies on fisheye, panoramic, and other wide field-of-view (FoV) cameras, yet most pipelines still apply planar CNNs designed for pinhole imagery on 2D grids, where pixel-space neighborhoods misrepresent physical adjacency and models are sensitive to global rotations. Traditional spherical CNNs partially address this mismatch but require costly spherical harmonic transforms that constrain resolution and efficiency. We present the Unified Spherical Frontend (USF), a distortion-free, lens-agnostic framework that transforms images from any calibrated camera onto the unit sphere via ray-direction correspondences, and performs spherical resampling, convolution, and pooling canonically in the spatial domain. USF is modular: projection, location sampling, value interpolation, and resolution control are fully decoupled. Its configurable distance-only convolution…
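To make the ray-direction idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes two idealized, distortion-free camera models (a pinhole and an equidistant fisheye with r = f * theta); the function names and intrinsics are hypothetical. It maps pixels to unit rays on the sphere and then checks a fact that motivates distance-only kernels: great-circle distances between points do not change when the whole sphere is rotated.

```python
import numpy as np

def pinhole_rays(u, v, fx, fy, cx, cy):
    """Unit ray directions for pixels of an ideal pinhole camera (assumed model)."""
    x = (u - cx) / fx
    y = (v - cy) / fy
    d = np.stack([x, y, np.ones_like(x)], axis=-1)
    return d / np.linalg.norm(d, axis=-1, keepdims=True)

def equidistant_fisheye_rays(u, v, f, cx, cy):
    """Unit ray directions for an ideal equidistant fisheye, r = f * theta (assumed model)."""
    dx, dy = u - cx, v - cy
    theta = np.hypot(dx, dy) / f          # angle from the optical axis
    phi = np.arctan2(dy, dx)              # azimuth around the optical axis
    s = np.sin(theta)
    return np.stack([s * np.cos(phi), s * np.sin(phi), np.cos(theta)], axis=-1)

def geodesic_distance(a, b):
    """Pairwise great-circle distances (radians) between two sets of unit vectors."""
    return np.arccos(np.clip(a @ b.T, -1.0, 1.0))

def rotation_matrix(yaw, pitch):
    """Rotation about z by `yaw`, followed by rotation about x by `pitch`."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cx_, sx_ = np.cos(pitch), np.sin(pitch)
    rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cx_, -sx_], [0.0, sx_, cx_]])
    return rx @ rz

# Hypothetical 8x6 pixel grid and intrinsics, chosen only for illustration.
u, v = np.meshgrid(np.arange(8, dtype=float), np.arange(6, dtype=float))
rays = equidistant_fisheye_rays(u, v, f=4.0, cx=3.5, cy=2.5).reshape(-1, 3)

# Rotate every ray by the same global rotation and compare pairwise distances.
R = rotation_matrix(yaw=0.7, pitch=-0.3)
d_before = geodesic_distance(rays, rays)
d_after = geodesic_distance(rays @ R.T, rays @ R.T)
print(np.allclose(d_before, d_after, atol=1e-6))
```

Because both camera models end up as the same kind of object (an array of unit vectors), any downstream operation defined on the sphere can ignore which lens produced the image. A kernel that depends only on geodesic distance sees identical inputs before and after a global rotation, which is the intuition behind rotation-equivariance in the spatial domain.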
