TL;DR
MultiNet is a unified architecture for joint classification, detection, and semantic segmentation in autonomous driving. It performs all three tasks in under 100 ms and outperforms the state of the art in road segmentation on the KITTI dataset.
Contribution
The paper presents a simple, end-to-end trainable model in which a single encoder is shared across the three tasks, keeping inference fast enough for real-time applications such as autonomous driving.
Findings
Outperforms the state of the art in road segmentation on KITTI
Performs all three tasks in under 100 ms
Shares one encoder across classification, detection, and semantic segmentation
Abstract
While most approaches to semantic reasoning have focused on improving performance, in this paper we argue that computational times are very important in order to enable real time applications such as autonomous driving. Towards this goal, we present an approach to joint classification, detection and semantic segmentation via a unified architecture where the encoder is shared amongst the three tasks. Our approach is very simple, can be trained end-to-end and performs extremely well in the challenging KITTI dataset, outperforming the state-of-the-art in the road segmentation task. Our approach is also very efficient, taking less than 100 ms to perform all tasks.
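The shared-encoder idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `SharedEncoderModel` and the functions `toy_encoder`, `classify`, `detect`, and `segment` are hypothetical stand-ins, and simple list arithmetic replaces the convolutional encoder and task decoders. The sketch only shows the structural point that the encoder runs once per image and its features are reused by all three task heads.

```python
from typing import Callable, Dict, List

Image = List[List[float]]   # toy image: rows of pixel intensities in [0, 1]
Features = List[float]      # toy feature vector produced by the encoder


class SharedEncoderModel:
    """Hypothetical wrapper: one encoder, several task-specific heads."""

    def __init__(self, encoder: Callable[[Image], Features],
                 heads: Dict[str, Callable[[Features], object]]) -> None:
        self.encoder = encoder
        self.heads = heads
        self.encoder_calls = 0  # counts how often the encoder is run

    def forward(self, image: Image) -> Dict[str, object]:
        # The encoder runs exactly once per image ...
        self.encoder_calls += 1
        features = self.encoder(image)
        # ... and every head consumes the same shared features.
        return {name: head(features) for name, head in self.heads.items()}


def toy_encoder(image: Image) -> Features:
    # Stand-in for a CNN encoder: one feature per row (the row mean).
    return [sum(row) / len(row) for row in image]


def classify(features: Features) -> str:
    # Stand-in for scene classification: one label for the whole image.
    return "bright" if sum(features) / len(features) > 0.5 else "dark"


def detect(features: Features) -> List[int]:
    # Stand-in for detection: indices of rows that contain an "object".
    return [i for i, f in enumerate(features) if f > 0.5]


def segment(features: Features) -> List[int]:
    # Stand-in for segmentation: a binary label for every row.
    return [1 if f > 0.5 else 0 for f in features]


model = SharedEncoderModel(
    toy_encoder,
    {"classification": classify, "detection": detect, "segmentation": segment},
)
outputs = model.forward([[0.9, 0.7], [0.1, 0.3], [0.8, 0.6]])
print(outputs, "encoder calls:", model.encoder_calls)
```

Because the heads only read the shared features, adding a task in this sketch adds the cost of one more head, not another encoder pass, which is the efficiency argument the abstract makes.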
Taxonomy
Methods: 1-Dimensional Convolutional Neural Networks
