YOLOPv2: Better, Faster, Stronger for Panoptic Driving Perception
Cheng Han, Qichao Zhao, Shuyi Zhang, Yinzi Chen, Zhenlin Zhang, Jinwei, Yuan

TL;DR
This paper introduces YOLOPv2, a multi-task learning network that enhances real-time panoptic driving perception by improving accuracy and speed, achieving state-of-the-art results on BDD100K with significantly reduced inference time.
Contribution
YOLOPv2 presents a novel, efficient multi-task network for traffic detection, segmentation, and lane detection, outperforming previous models in accuracy and speed.
Findings
Achieved state-of-the-art performance on BDD100K dataset.
Reduced inference time by 50% compared to previous SOTA.
Effectively balances high accuracy with real-time efficiency.
Abstract
Over the last decade, multi-tasking learning approaches have achieved promising results in solving panoptic driving perception problems, providing both high-precision and high-efficiency performance. It has become a popular paradigm when designing networks for real-time practical autonomous driving system, where computation resources are limited. This paper proposed an effective and efficient multi-task learning network to simultaneously perform the task of traffic object detection, drivable road area segmentation and lane detection. Our model achieved the new state-of-the-art (SOTA) performance in terms of accuracy and speed on the challenging BDD100K dataset. Especially, the inference time is reduced by half compared to the previous SOTA model. Code will be released in the near future.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neural Network Applications · Autonomous Vehicle Technology and Safety · Video Surveillance and Tracking Methods
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
