ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving

Tao Ma; Hongbin Zhou; Qiusheng Huang; Xuemeng Yang; Jianfei Guo; Bo Zhang; Min Dou; Yu Qiao; Botian Shi; Hongsheng Li

arXiv:2411.05311·cs.CV·November 11, 2024

Open Access

TL;DR

ZOPP introduces a zero-shot, multi-modal framework for offboard panoptic perception in autonomous driving, enabling auto-labeling and recognition beyond traditional closed-set methods, validated on the Waymo dataset.

Contribution

The paper presents the first multi-modal zero-shot offboard panoptic perception framework for autonomous driving scenes, integrating vision foundation models with 3D point cloud data.
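One common way to combine 2D image predictions with 3D point clouds is to project each LiDAR point into the camera image and copy the label of the pixel it lands on. The sketch below illustrates that general idea only; it is not taken from the paper and does not claim to be how ZOPP works. The function `label_points_from_mask`, the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, and the toy mask are all hypothetical, and the points are assumed to be already expressed in the camera frame with `z` pointing forward.

```python
import math
from typing import List, Optional, Sequence, Tuple

# (x, y, z) in the camera frame, z pointing forward (assumption for this sketch)
Point = Tuple[float, float, float]


def label_points_from_mask(
    points_cam: Sequence[Point],
    mask: Sequence[Sequence[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Optional[int]]:
    """Assign each 3D point the label of the mask pixel it projects onto.

    Hypothetical helper: uses a plain pinhole model and returns None for
    points behind the camera or projecting outside the image.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    labels: List[Optional[int]] = []
    for x, y, z in points_cam:
        if z <= 0.0:
            # Point is behind the image plane, so it has no valid pixel.
            labels.append(None)
            continue
        # Pinhole projection to pixel coordinates (column u, row v).
        u = math.floor(fx * x / z + cx)
        v = math.floor(fy * y / z + cy)
        if 0 <= u < width and 0 <= v < height:
            labels.append(mask[v][u])
        else:
            labels.append(None)
    return labels


# Toy 4x4 label mask standing in for a 2D segmentation result.
toy_mask = [
    [0, 0, 0, 0],
    [0, 7, 7, 3],
    [0, 7, 7, 3],
    [0, 0, 0, 0],
]
toy_points = [(0.0, 0.0, 1.0), (1.0, -1.0, 1.0), (0.0, 0.0, -1.0), (5.0, 0.0, 1.0)]
toy_labels = label_points_from_mask(toy_points, toy_mask, fx=1.0, fy=1.0, cx=2.0, cy=2.0)
```

A real pipeline would also need the LiDAR-to-camera extrinsics, lens distortion handling, and occlusion checks, all of which this sketch leaves out.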

Findings

01. Effective zero-shot recognition in autonomous driving scenes.

02. High-quality auto-labeling across multiple perception tasks.

03. Potential for real-world application demonstrated through downstream experiments.

Abstract

Offboard perception aims to automatically generate high-quality 3D labels for autonomous driving (AD) scenes. Existing offboard methods focus on 3D object detection with a closed-set taxonomy and fail to match human-level recognition capability on rapidly evolving perception tasks. Because of the heavy reliance on human labels and the prevalence of data imbalance and sparsity, a unified framework for offboard auto-labeling of the various elements in AD scenes, one that meets the distinct needs of perception tasks, has not been fully explored. In this paper, we propose a novel multi-modal Zero-shot Offboard Panoptic Perception (ZOPP) framework for autonomous driving scenes. ZOPP integrates the powerful zero-shot recognition capabilities of vision foundation models and 3D representations derived from point clouds. To the best of our knowledge, ZOPP represents a pioneering effort in the domain of multi-modal…


Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Autonomous Vehicle Technology and Safety

Methods: Focus