HEAL-SWIN: A Vision Transformer On The Sphere
Oscar Carlsson, Jan E. Gerken, Hampus Linander, Heiner Spieß, Fredrik Ohlsson, Christoffer Petersson, Daniel Persson

TL;DR
HEAL-SWIN introduces a novel vision transformer that operates directly on spherical data using the HEALPix grid, effectively addressing projection and distortion issues in fisheye image analysis for robotics.
Contribution
The paper presents HEAL-SWIN, a new spherical vision transformer combining HEALPix grid with SWIN transformer for efficient high-resolution spherical image processing.
Findings
Outperforms existing methods on synthetic and real datasets
Effective for semantic segmentation, depth regression, and classification
Operates with minimal computational overhead on spherical data
Abstract
High-resolution wide-angle fisheye images are becoming more and more important for robotics applications such as autonomous driving. However, using ordinary convolutional neural networks or vision transformers on this data is problematic due to projection and distortion losses introduced when projecting to a rectangular grid on the plane. We introduce the HEAL-SWIN transformer, which combines the highly uniform Hierarchical Equal Area iso-Latitude Pixelation (HEALPix) grid used in astrophysics and cosmology with the Hierarchical Shifted-Window (SWIN) transformer to yield an efficient and flexible model capable of training on high-resolution, distortion-free spherical data. In HEAL-SWIN, the nested structure of the HEALPix grid is used to perform the patching and windowing operations of the SWIN transformer, enabling the network to process spherical representations with minimal…
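The patching and windowing idea described in the abstract can be illustrated with a minimal sketch. It relies only on a known property of the HEALPix nested ordering: a grid with resolution parameter `nside` has `12 * nside**2` pixels, and the four children of a coarser pixel `p` occupy the consecutive indices `4p` to `4p + 3`. Grouping consecutive runs of `4**k` pixels therefore yields spatially compact patches or windows by a plain reshape. The function names `patch_nested` and `window_nested`, the toy sizes, and the use of NumPy are assumptions made for illustration; this is not the authors' implementation, and the window shifting of the SWIN transformer is not shown.

```python
import numpy as np


def npix(nside: int) -> int:
    """Number of pixels in a full HEALPix grid with the given nside."""
    return 12 * nside * nside


def patch_nested(x: np.ndarray, patch_size: int) -> np.ndarray:
    """Group consecutive nested-ordered pixels into patches.

    x has shape (num_pixels, channels) and is assumed to be in nested
    ordering. patch_size should be a power of 4 so that each patch
    corresponds to one pixel of a coarser HEALPix grid.
    Returns an array of shape (num_pixels // patch_size, patch_size * channels).
    """
    n, c = x.shape
    if n % patch_size != 0:
        raise ValueError("pixel count must be divisible by patch_size")
    return x.reshape(n // patch_size, patch_size * c)


def window_nested(tokens: np.ndarray, window_size: int) -> np.ndarray:
    """Partition patch tokens into non-overlapping attention windows.

    tokens has shape (num_tokens, dim), still in nested ordering.
    Returns an array of shape (num_windows, window_size, dim).
    """
    n, d = tokens.shape
    if n % window_size != 0:
        raise ValueError("token count must be divisible by window_size")
    return tokens.reshape(n // window_size, window_size, d)


# Toy example: nside = 8 gives 768 pixels, here with 3 channels (hypothetical sizes).
nside = 8
x = np.arange(npix(nside) * 3, dtype=np.float32).reshape(npix(nside), 3)

patches = patch_nested(x, patch_size=4)            # one token per coarser pixel
windows = window_nested(patches, window_size=16)   # 16 tokens per window

print(patches.shape)  # (192, 12)
print(windows.shape)  # (12, 16, 12)
```

With these toy sizes each window covers 64 original pixels, which is exactly one of the 12 HEALPix base pixels. Because the grouping is a reshape, it needs no neighbor lookups or interpolation, which fits the abstract's description of the nested structure carrying out the patching and windowing.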
Taxonomy
Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Advanced Image and Video Retrieval Techniques
