Is Pseudo-Lidar needed for Monocular 3D Object detection?

Dennis Park; Rares Ambrus; Vitor Guizilini; Jie Li; Adrien Gaidon

arXiv:2108.06417·cs.CV·August 17, 2021

TL;DR

This paper introduces DD3D, an end-to-end, single-stage monocular 3D object detector that benefits from depth pre-training without the limitations of pseudo-lidar methods. It reports state-of-the-art results on KITTI-3D (16.34% AP for Cars, 9.28% AP for Pedestrians) and NuScenes (41.5% mAP).

Contribution

The authors propose a single-stage, end-to-end monocular 3D detection architecture designed to transfer information from depth pre-training to 3D detection, improving over pseudo-lidar-based methods.

Findings

01. Achieves 16.34% AP for Cars on KITTI-3D

02. Achieves 9.28% AP for Pedestrians on KITTI-3D

03. Attains 41.5% mAP on NuScenes

Abstract

Recent progress in 3D object detection from single images leverages monocular depth estimation as a way to produce 3D pointclouds, turning cameras into pseudo-lidar sensors. These two-stage detectors improve with the accuracy of the intermediate depth estimation network, which can itself be improved without manual labels via large-scale self-supervised learning. However, they tend to suffer from overfitting more than end-to-end methods, are more complex, and the gap with similar lidar-based detectors remains significant. In this work, we propose an end-to-end, single stage, monocular 3D object detector, DD3D, that can benefit from depth pre-training like pseudo-lidar methods, but without their limitations. Our architecture is designed for effective information transfer between depth estimation and 3D detection, allowing us to scale with the amount of unlabeled pre-training data. Our…
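To make the pseudo-lidar idea in the abstract concrete, here is a minimal sketch of the step that turns a predicted depth map into a 3D point cloud, assuming a standard pinhole camera model. The function name `depth_to_pseudo_lidar` and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative assumptions, not taken from the paper or its code. This is the intermediate representation used by the two-stage detectors the abstract describes, not a part of DD3D itself.

```python
import numpy as np


def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) camera-frame point cloud.

    Hypothetical helper for illustration: assumes a pinhole camera with focal
    lengths (fx, fy) and principal point (cx, cy), all in pixels.
    """
    h, w = depth.shape
    # Pixel grids of shape (H, W): u indexes columns, v indexes rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    # Keep only pixels with a valid (positive) depth estimate.
    return points[points[:, 2] > 0]


# Toy 2x3 depth map at a constant 10 m, with one invalid pixel.
depth = np.full((2, 3), 10.0)
depth[0, 0] = 0.0
points = depth_to_pseudo_lidar(depth, fx=100.0, fy=100.0, cx=1.0, cy=0.5)
print(points.shape)
```

In a pseudo-lidar pipeline, a point cloud like `points` would then be passed to a lidar-style 3D detector as a second stage. The abstract's argument is that an end-to-end detector can benefit from the same depth pre-training without this intermediate conversion.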
