RMT-PPAD: Real-time Multi-task Learning for Panoptic Perception in Autonomous Driving
Jiayuan Wang, Q. M. Jonathan Wu, Katsuya Suto, and Ning Zhang

TL;DR
This paper introduces RMT-PPAD, a real-time transformer-based multi-task model for panoptic perception in autonomous driving, achieving state-of-the-art accuracy and efficiency on the BDD100K dataset.
Contribution
The work presents a lightweight, adaptive multi-task learning framework with novel modules for feature fusion and segmentation, addressing negative transfer and label inconsistency issues.
Findings
Achieves 84.9% mAP50 in object detection
Attains 92.6% mIoU in drivable area segmentation
Runs in real time at 32.6 FPS
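To make the reported quantities concrete, here is a minimal sketch of how a mean IoU score and a per-frame latency are commonly computed. It is not the paper's evaluation code: the function names `iou`, `mean_iou`, and `frame_latency_ms` are hypothetical, labels are assumed to be flat lists of class indices, and treating an empty class as a perfect match is just one convention.

```python
def iou(pred_mask, target_mask):
    """Intersection over union of two equal-length boolean masks."""
    inter = sum(1 for p, t in zip(pred_mask, target_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, target_mask) if p or t)
    # Assumption: a class absent from both masks counts as a perfect match.
    return inter / union if union else 1.0


def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over class indices 0..num_classes-1."""
    per_class = []
    for c in range(num_classes):
        pred_mask = [p == c for p in pred]
        target_mask = [t == c for t in target]
        per_class.append(iou(pred_mask, target_mask))
    return sum(per_class) / len(per_class)


def frame_latency_ms(fps):
    """Convert a throughput in frames per second to milliseconds per frame."""
    return 1000.0 / fps


# Toy example: four pixels, two classes (0 = background, 1 = drivable).
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)  # (1/2 + 2/3) / 2
latency = frame_latency_ms(32.6)               # about 30.7 ms per frame
```

Under this reading, 32.6 FPS corresponds to a budget of roughly 30.7 ms per frame for all three tasks combined.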
Abstract
Autonomous driving systems rely on panoptic driving perception that requires both precision and real-time performance. In this work, we propose RMT-PPAD, a real-time, transformer-based multi-task model that jointly performs object detection, drivable area segmentation, and lane line segmentation. We introduce a lightweight module, a gate control with an adapter, to adaptively fuse shared and task-specific features, effectively alleviating negative transfer between tasks. Additionally, we design an adaptive segmentation decoder to learn the weights over multi-scale features automatically during the training stage. This avoids the manual design of task-specific structures for different segmentation tasks. We also identify and resolve the inconsistency between training and testing labels in lane line segmentation. This allows fairer evaluation. Experiments on the BDD100K dataset demonstrate…
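The two mechanisms named in the abstract can be illustrated with a generic sketch. This is not the authors' implementation: the abstract does not specify the module internals, so the code below assumes a plain sigmoid gate that blends shared and task-specific features element-wise, and a softmax over one learnable logit per scale for the decoder. The names `gated_fuse` and `fuse_scales` are hypothetical, and features are flat lists of floats for simplicity.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(shared, task_specific, gate_logits):
    """Blend two feature vectors element-wise with a gate g in (0, 1).

    Assumption: g near 1 favors the task-specific feature and g near 0
    favors the shared one. In a trained model the logits would be learned.
    """
    fused = []
    for s, t, z in zip(shared, task_specific, gate_logits):
        g = sigmoid(z)
        fused.append(g * t + (1.0 - g) * s)
    return fused


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scales(scale_features, scale_logits):
    """Weighted sum of per-scale feature vectors of equal length.

    Assumption: one logit per scale, normalized with softmax so the
    weights are positive and sum to 1.
    """
    weights = softmax(scale_logits)
    length = len(scale_features[0])
    return [
        sum(w * feats[i] for w, feats in zip(weights, scale_features))
        for i in range(length)
    ]


# With zero logits the gate is 0.5 and all scales are weighted equally.
fused = gated_fuse([0.0, 2.0], [1.0, 4.0], [0.0, 0.0])
merged = fuse_scales([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
```

Because the scale weights are parameters rather than hand-picked constants, each segmentation task could in principle settle on its own mix of scales during training, which is the kind of behavior the abstract attributes to the adaptive decoder.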
Taxonomy
Topics: Advanced Neural Network Applications · Autonomous Vehicle Technology and Safety · Domain Adaptation and Few-Shot Learning
