UniVision: A Unified Framework for Vision-Centric 3D Perception
Yu Hong, Qian Liu, Huayuan Cheng, Danjiao Ma, Hang Dai, Yu Wang, Guangzhi Cao, Yong Ding

TL;DR
UniVision is a unified, efficient framework that advances vision-centric 3D perception by integrating occupancy prediction and object detection with novel feature transformation and fusion modules, achieving state-of-the-art results.
Contribution
The paper introduces a unified framework for vision-centric 3D perception that combines an explicit-implicit view transform, local-global feature extraction and fusion, and multi-task training strategies.
Findings
Achieves state-of-the-art results on four benchmarks
Improves accuracy in occupancy prediction and object detection
Demonstrates effective multi-task learning stability
Abstract
The past few years have witnessed the rapid development of vision-centric 3D perception in autonomous driving. Although the 3D perception models share many structural and conceptual similarities, there still exist gaps in their feature representations, data formats, and objectives, posing challenges for unified and efficient 3D perception framework design. In this paper, we present UniVision, a simple and efficient framework that unifies two major tasks in vision-centric 3D perception, i.e., occupancy prediction and object detection. Specifically, we propose an explicit-implicit view transform module for complementary 2D-3D feature transformation. We propose a local-global feature extraction and fusion module for efficient and adaptive voxel and BEV feature extraction, enhancement, and interaction. Further, we propose a joint occupancy-detection data augmentation strategy and a…
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Robotics and Sensor-Based Localization
