Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection

Junjie Huang, Yun Ye, Zhujin Liang, Yi Shan, and Dalong Du

arXiv:2311.07152 · cs.CV · November 14, 2023 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces DAL, a novel LiDAR-camera fusion paradigm for 3D object detection that reduces overfitting and improves performance by mimicking data annotation, offering a simple, portable, and effective baseline.

Contribution

The paper proposes DAL, a new paradigm that imitates data annotation for LiDAR-camera fusion, significantly enhancing detection performance and robustness.

Findings

01. DAL achieves a superior speed-accuracy trade-off.

02. DAL substantially improves detection performance.

03. The method is simple, portable, and effective.

Abstract

3D object detection with LiDAR-camera fusion encounters overfitting in algorithm development, which derives from the violation of some fundamental rules. We refer to the data annotation process in dataset construction to complement the theory, and argue that the regression task prediction should not involve features from the camera branch. Following the cutting-edge perspective of 'Detecting As Labeling', we propose a novel paradigm dubbed DAL. With the most classical elementary algorithms, a simple prediction pipeline is constructed by imitating the data annotation process. We then train it in the simplest way to minimize its dependencies and strengthen its portability. Though simple in construction and training, the proposed DAL paradigm not only substantially pushes the performance boundary but also provides a superior trade-off between speed and accuracy among all existing methods. With…
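
The abstract's central claim, that regression should not consume camera-branch features, can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`linear`, `predict`), the weights, and the feature sizes are hypothetical, and feeding the classification head with concatenated LiDAR and camera features is an assumption of this sketch rather than something the abstract states. The only property taken from the source is that the box regression path sees LiDAR features alone.

```python
from typing import List, Sequence, Tuple


def linear(weights: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Apply a bias-free linear layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def predict(
    lidar_feat: Sequence[float],
    camera_feat: Sequence[float],
    w_cls: Sequence[Sequence[float]],
    w_reg: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Toy decoupled heads (hypothetical, for illustration only)."""
    # Assumption of this sketch: classification may use both modalities.
    fused = list(lidar_feat) + list(camera_feat)
    class_scores = linear(w_cls, fused)
    # From the abstract: regression must not involve camera-branch features,
    # so the box head receives the LiDAR features only.
    box_params = linear(w_reg, lidar_feat)
    return class_scores, box_params


# Hypothetical per-object features: 3 LiDAR channels, 2 camera channels.
lidar = [0.5, -1.0, 2.0]
cam_a = [1.0, 0.0]
cam_b = [0.0, 3.0]

# Hypothetical weights: 2 classes over 5 fused channels, 3 box values over 3 LiDAR channels.
w_cls = [[0.1, 0.2, 0.3, 0.4, 0.5], [0.5, 0.4, 0.3, 0.2, 0.1]]
w_reg = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

scores_a, box_a = predict(lidar, cam_a, w_cls, w_reg)
scores_b, box_b = predict(lidar, cam_b, w_cls, w_reg)
```

Swapping `cam_a` for `cam_b` changes the class scores but leaves the regressed box values untouched, which is the decoupling the abstract argues for.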

Code & Models

Repositories

HuangJunJie2017/BEVDet
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Industrial Vision Systems and Defect Detection

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings