BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection

Junjie Huang; Guan Huang

arXiv:2203.17054·cs.CV·June 17, 2022·156 cites

TL;DR

BEVDet4D extends the BEVDet framework with temporal awareness: it fuses features from the previous frame with those of the current frame to improve 3D object detection accuracy and velocity estimation in multi-camera systems.

Contribution

The paper proposes BEVDet4D, a novel 4D spatial-temporal framework that enhances multi-camera 3D detection by incorporating temporal cues with minimal additional computation.

Findings

01. Reduces velocity error by up to 62.9%.
02. Achieves 54.5% NDS on nuScenes, surpassing previous methods.
03. Makes vision-based detection comparable with LiDAR- or radar-based methods in velocity estimation.

Abstract

Single-frame data contains finite information, which limits the performance of existing vision-based multi-camera 3D object detection paradigms. To fundamentally push the performance boundary in this area, a novel paradigm dubbed BEVDet4D is proposed to lift the scalable BEVDet paradigm from the spatial-only 3D space to the spatial-temporal 4D space. We upgrade the naive BEVDet framework with a few modifications, just enough to fuse the feature from the previous frame with the corresponding one in the current frame. In this way, with a negligible additional computing budget, we enable BEVDet4D to access temporal cues by querying and comparing the two candidate features. Beyond this, we simplify the task of velocity prediction by removing the factors of ego-motion and time from the learning target. As a result, BEVDet4D with robust generalization performance reduces the velocity error…
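One way to read "removing the factors of ego-motion and time in the learning target" is that the network regresses an object's displacement between two ego-aligned frames, and velocity is recovered afterwards by dividing by the frame interval. The sketch below illustrates that reading in plain Python on 2D bird's-eye-view points. It is an assumption-laden illustration, not the authors' implementation: the function names, the `ego_shift`/`ego_yaw` parameterization of ego motion, and the example numbers are all hypothetical.

```python
import math


def to_current_ego(point_prev, ego_shift, ego_yaw):
    """Express a BEV point given in the previous ego frame in the current ego frame.

    ego_shift: (x, y) of the current ego origin, measured in the previous ego frame.
    ego_yaw:   heading of the current ego frame relative to the previous one (radians).
    Both parameters are hypothetical stand-ins for whatever pose data is available.
    """
    dx = point_prev[0] - ego_shift[0]
    dy = point_prev[1] - ego_shift[1]
    cos_y, sin_y = math.cos(ego_yaw), math.sin(ego_yaw)
    # Rotate by -ego_yaw so the offset is expressed along the current frame's axes.
    return (cos_y * dx + sin_y * dy, -sin_y * dx + cos_y * dy)


def displacement_target(obj_prev, obj_curr, ego_shift, ego_yaw):
    """Displacement of an object between frames after cancelling ego motion.

    obj_prev is given in the previous ego frame, obj_curr in the current ego frame.
    The result contains neither ego motion nor the frame interval.
    """
    aligned = to_current_ego(obj_prev, ego_shift, ego_yaw)
    return (obj_curr[0] - aligned[0], obj_curr[1] - aligned[1])


def velocity_from_displacement(displacement, dt):
    """Recover velocity from a displacement target and the frame interval in seconds."""
    return (displacement[0] / dt, displacement[1] / dt)


# Hypothetical scenario: the ego vehicle drives 2 m forward without turning.
ego_shift, ego_yaw = (2.0, 0.0), 0.0

# A parked car 10 m ahead in the previous frame appears 8 m ahead in the current one.
static_disp = displacement_target((10.0, 0.0), (8.0, 0.0), ego_shift, ego_yaw)

# A car that also moved 1 m forward appears 9 m ahead in the current frame.
moving_disp = displacement_target((10.0, 0.0), (9.0, 0.0), ego_shift, ego_yaw)
moving_vel = velocity_from_displacement(moving_disp, dt=0.5)
```

In this scenario the parked car gets a zero target even though its apparent position changed by 2 m, and the moving car's 1 m displacement converts to 2 m/s only at the final step, where the 0.5 s interval is applied.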

Code & Models

Repositories

HuangJunJie2017/BEVDet (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Optical Sensing Technologies