DarSwin-Unet: Distortion Aware Encoder-Decoder Architecture
Akshaya Athwale, Ichrak Shili, Émile Bergeron, Ola Ahmad, and Jean-François Lalonde

TL;DR
DarSwin-Unet is a novel encoder-decoder model designed for pixel-level tasks on wide-angle fisheye images, effectively handling distortions and enabling zero-shot adaptation to unseen lens distortions.
Contribution
The paper introduces DarSwin-Unet, a U-Net architecture with a radial transformer that adapts to lens distortions and improves pixel-level task performance in fisheye images.
Findings
Achieves superior results on multiple datasets compared to baselines.
Effectively handles various levels of distortion, including out-of-distribution cases.
Demonstrates zero-shot adaptation to unseen lens distortions.
Abstract
Wide-angle fisheye images are becoming increasingly common for perception tasks in applications such as robotics, security, and mobility (e.g., drones, avionics). However, current models often either ignore the distortions in wide-angle images or are not suited to pixel-level tasks. In this paper, we present an encoder-decoder model based on a radial transformer architecture that adapts to distortions in wide-angle lenses by leveraging the physical characteristics defined by the radial distortion profile. In contrast to the original model, which only performs classification tasks, we introduce a U-Net architecture, DarSwin-Unet, designed for pixel-level tasks. Furthermore, we propose a novel strategy that minimizes sparsity when sampling the image to create its input tokens. Our approach enhances the model's capability to handle pixel-level tasks in wide-angle fisheye images,…
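To make the idea of sampling along a radial distortion profile more concrete, here is a minimal sketch. It is not the authors' implementation. The equidistant profile (r = f·θ), the helper names `equidistant_profile` and `polar_sample_points`, and the grid sizes are all assumptions chosen for illustration, since this summary does not specify the lens model or the sampling strategy used in the paper. The sketch places sample points on rings that are evenly spaced in incident angle θ and then maps each ring to an image radius through the profile, so a different lens only changes the `profile` function.

```python
import math


def equidistant_profile(theta, focal):
    """Hypothetical distortion profile (assumption): image radius r = focal * theta."""
    return focal * theta


def polar_sample_points(width, height, n_radial, n_azimuth, max_theta, profile):
    """Return (x, y) pixel coordinates on a distortion-aware polar grid.

    Rings are spaced uniformly in incident angle theta, then mapped to an
    image radius via the supplied profile function.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0  # image centre
    points = []
    for i in range(n_radial):
        # Centre of the i-th angular bin in [0, max_theta].
        theta = max_theta * (i + 0.5) / n_radial
        r = profile(theta)
        for j in range(n_azimuth):
            phi = 2.0 * math.pi * j / n_azimuth
            points.append((cx + r * math.cos(phi), cy + r * math.sin(phi)))
    return points


# Focal length chosen so that theta = 90 degrees maps to the edge of a 128x128 image.
focal = 63.5 / (math.pi / 2)
pts = polar_sample_points(
    128, 128,
    n_radial=4, n_azimuth=8,
    max_theta=math.pi / 2,
    profile=lambda t: equidistant_profile(t, focal),
)
print(len(pts), pts[0])
```

Swapping in another profile function changes where the rings fall in the image while leaving the token layout (rings by azimuth bins) unchanged, which is one way to picture why a model built on such a grid could be evaluated on lenses it was not trained on.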
Taxonomy
Methods: Max Pooling · Concatenated Skip Connection · Convolution · U-Net
