TL;DR
BlitzNet is a deep neural network for real-time scene understanding that performs object detection and semantic segmentation jointly in a single forward pass, reaching state-of-the-art accuracy among real-time systems.
Contribution
It introduces a unified architecture that handles detection and segmentation in one network, which saves computation compared with running a separate model per task and improves the accuracy of both tasks.
Findings
Achieves real-time performance on VOC and COCO datasets.
Outperforms existing real-time systems in detection and segmentation accuracy.
Shows that object detection and semantic segmentation each gain accuracy when learned jointly.
Abstract
Real-time scene understanding has become crucial in many applications such as autonomous driving. In this paper, we propose a deep architecture, called BlitzNet, that jointly performs object detection and semantic segmentation in one forward pass, allowing real-time computations. Besides the computational gain of having a single network to perform several tasks, we show that object detection and semantic segmentation benefit from each other in terms of accuracy. Experimental results for VOC and COCO datasets show state-of-the-art performance for object detection and segmentation among real-time systems.
