GeoBEV: Learning Geometric BEV Representation for Multi-view 3D Object Detection

Jinqing Zhang; Yanan Zhang; Yunlong Qi; Zehua Fu; Qingjie Liu; Yunhong Wang

arXiv:2409.01816 · cs.CV · December 24, 2024

TL;DR

GeoBEV introduces a high-resolution BEV representation with novel sampling and labeling techniques to enhance multi-view 3D object detection accuracy, achieving state-of-the-art results on nuScenes.

Contribution

The paper proposes Radial-Cartesian BEV Sampling and In-Box Labeling to improve geometric fidelity in BEV representations for 3D detection.

Findings

01 RC-Sampling outperforms other feature transformation methods at efficiently generating high-resolution, dense BEV representations

02 Achieves 66.2% NDS on the nuScenes test set

03 Introduces novel loss and sampling strategies for better 3D scene understanding

Abstract

Bird's-Eye-View (BEV) representation has emerged as a mainstream paradigm for multi-view 3D object detection, demonstrating impressive perceptual capabilities. However, existing methods overlook the geometric quality of BEV representation, leaving it in a low-resolution state and failing to restore the authentic geometric information of the scene. In this paper, we identify the drawbacks of previous approaches that limit the geometric quality of BEV representation and propose Radial-Cartesian BEV Sampling (RC-Sampling), which outperforms other feature transformation methods in efficiently generating high-resolution dense BEV representation to restore fine-grained geometric information. Additionally, we design a novel In-Box Label to substitute the traditional depth label generated from the LiDAR points. This label reflects the actual geometric structure of objects rather than just their…
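
The abstract describes the In-Box Label only briefly, so the following is a minimal sketch of one plausible reading, not the authors' implementation: every BEV grid cell whose center lies inside an annotated box is marked positive, instead of only the cells hit by LiDAR points. The function names, the box format `(cx, cy, length, width, yaw)`, and the grid parameters are all hypothetical and chosen for illustration.

```python
import numpy as np


def in_box_mask(centers_xy, box):
    """Return a boolean mask of the points that fall inside one rotated box.

    centers_xy: (N, 2) array of x, y coordinates.
    box: (cx, cy, length, width, yaw), an assumed format for this sketch.
    """
    cx, cy, length, width, yaw = box
    # Shift points so the box center is the origin.
    d = centers_xy - np.array([cx, cy])
    # Rotate by -yaw so the box becomes axis-aligned in its local frame.
    c, s = np.cos(yaw), np.sin(yaw)
    local_x = d[:, 0] * c + d[:, 1] * s
    local_y = -d[:, 0] * s + d[:, 1] * c
    return (np.abs(local_x) <= length / 2) & (np.abs(local_y) <= width / 2)


def bev_in_box_label(boxes, x_range=(-4.0, 4.0), y_range=(-4.0, 4.0), cell=1.0):
    """Build a dense BEV label: True where a cell center lies in any box."""
    # Cell centers along each axis (hypothetical grid layout).
    xs = np.arange(x_range[0] + cell / 2, x_range[1], cell)
    ys = np.arange(y_range[0] + cell / 2, y_range[1], cell)
    gx, gy = np.meshgrid(xs, ys, indexing="xy")
    centers = np.stack([gx.ravel(), gy.ravel()], axis=1)

    label = np.zeros(len(centers), dtype=bool)
    for box in boxes:
        label |= in_box_mask(centers, box)
    # Rows index y, columns index x.
    return label.reshape(len(ys), len(xs))


if __name__ == "__main__":
    # One 4 m x 2 m box at the origin, aligned with the x axis.
    label = bev_in_box_label([(0.0, 0.0, 4.0, 2.0, 0.0)])
    print(label.astype(int))
```

A label built this way is dense over the whole box footprint, whereas a label derived from LiDAR returns covers only the cells where points happen to land. How the paper turns such a mask into a training target is not specified in the text above.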

Code & Models

Repositories

mengtan00/geobev (PyTorch, official)

Videos

GeoBEV: Learning Geometric BEV Representation for Multi-view 3D Object Detection

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization