YOLOPv2: Better, Faster, Stronger for Panoptic Driving Perception

Cheng Han; Qichao Zhao; Shuyi Zhang; Yinzi Chen; Zhenlin Zhang; Jinwei Yuan

arXiv:2208.11434·cs.CV·August 25, 2022·52 cites

TL;DR

This paper introduces YOLOPv2, a multi-task learning network for real-time panoptic driving perception that improves both accuracy and speed, achieving state-of-the-art results on BDD100K while halving the inference time of the previous state-of-the-art model.

Contribution

YOLOPv2 is an efficient multi-task network for traffic object detection, drivable road area segmentation, and lane detection that outperforms previous models in both accuracy and speed.

Findings

01. Achieved state-of-the-art performance on the BDD100K dataset.
02. Reduced inference time by 50% compared to the previous SOTA model.
03. Balances high accuracy with real-time efficiency.

Abstract

Over the last decade, multi-task learning approaches have achieved promising results in solving panoptic driving perception problems, providing both high precision and high efficiency. Multi-task learning has become a popular paradigm for designing networks for real-time, practical autonomous driving systems, where computational resources are limited. This paper proposes an effective and efficient multi-task learning network that simultaneously performs traffic object detection, drivable road area segmentation, and lane detection. Our model achieves new state-of-the-art (SOTA) performance in terms of accuracy and speed on the challenging BDD100K dataset. In particular, inference time is halved compared to the previous SOTA model. Code will be released in the near future.

Taxonomy

Topics: Advanced Neural Network Applications · Autonomous Vehicle Technology and Safety · Video Surveillance and Tracking Methods
