TL;DR
OmniDet is a multi-task perception network for surround-view fisheye cameras in autonomous driving. It jointly handles six tasks, adds a camera-geometry-based distortion adaptation mechanism and polygon-based object detection, and reports state-of-the-art depth and pose estimation on KITTI.
Contribution
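The paper presents a multi-task network with a shared encoder, which reduces computation, and synergized decoders in which tasks support each other. It also incorporates a novel adaptation mechanism that encodes the fisheye distortion model at both training and inference, along with polygonal object detection.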
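The shared-encoder idea can be illustrated with a minimal sketch. This is not the authors' implementation: `SharedEncoderModel`, `toy_encoder`, and `make_toy_decoder` are hypothetical names, the "image" is a flat list of numbers, and the decoders are independent toy functions, so the synergy between decoders and the frame pairs needed for visual odometry are not modeled. It only shows that the encoder runs once per input while all six task heads reuse its features.

```python
from typing import Callable, Dict, List

# The six tasks named in the abstract; the key names are illustrative.
TASKS = ["depth", "odometry", "semantic_seg", "motion_seg", "detection", "soiling"]

Features = List[float]


class SharedEncoderModel:
    """Hypothetical wrapper: one encoder pass feeds every task decoder."""

    def __init__(self, encoder: Callable[[List[float]], Features],
                 decoders: Dict[str, Callable[[Features], float]]) -> None:
        self.encoder = encoder
        self.decoders = decoders
        self.encoder_calls = 0  # counts how often the shared encoder runs

    def forward(self, image: List[float]) -> Dict[str, float]:
        self.encoder_calls += 1
        features = self.encoder(image)  # computed once
        # Every decoder consumes the same shared features.
        return {task: decode(features) for task, decode in self.decoders.items()}


def toy_encoder(image: List[float]) -> Features:
    """Stand-in for a CNN encoder: averages neighbouring pairs of values."""
    return [(a + b) / 2 for a, b in zip(image[0::2], image[1::2])]


def make_toy_decoder(scale: float) -> Callable[[Features], float]:
    """Stand-in for a task head: a scaled sum of the shared features."""
    return lambda features: scale * sum(features)


decoders = {task: make_toy_decoder(float(i + 1)) for i, task in enumerate(TASKS)}
model = SharedEncoderModel(toy_encoder, decoders)
outputs = model.forward([1.0, 3.0, 5.0, 7.0])

print(sorted(outputs))       # the six task names
print(model.encoder_calls)   # 1
```

Running six single-task networks would instead invoke six separate encoders on the same image, which is the computational cost the shared design avoids.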
Findings
Outperforms single-task models in all six perception tasks.
Achieves state-of-the-art depth and pose estimation on KITTI.
Demonstrates effective training on diverse global datasets with different camera setups.
Abstract
Surround-view fisheye cameras are commonly deployed in automated driving for 360° near-field sensing around the vehicle. This work presents a multi-task visual perception network on unrectified fisheye images to enable the vehicle to sense its surrounding environment. It consists of six primary tasks necessary for an autonomous driving system: depth estimation, visual odometry, semantic segmentation, motion segmentation, object detection, and lens soiling detection. We demonstrate that the jointly trained model performs better than the respective single-task versions. Our multi-task model has a shared encoder providing a significant computational advantage and has synergized decoders where tasks support each other. We propose a novel camera-geometry-based adaptation mechanism to encode the fisheye distortion model both at training and inference. This was crucial to enable training…
