PromptDet: A Lightweight 3D Object Detection Framework with LiDAR Prompts
Kun Guo, Qiang Ling

TL;DR
PromptDet is a lightweight 3D object detection framework that effectively combines camera data with minimal LiDAR prompts, improving detection accuracy while maintaining efficiency and flexibility for different inference modes.
Contribution
The paper introduces PromptDet, a novel prompt learning-based framework that integrates LiDAR signals into camera-based 3D detection with minimal additional parameters, enhancing performance and flexibility.
Findings
Significant mAP and NDS improvements with multi-modal fusion.
Effective camera-only detection with minimal performance loss.
Fewer than 2% extra parameters compared to baseline.
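The parameter-overhead finding can be checked with simple arithmetic. The snippet below is a minimal sketch of that check. The function name and both parameter counts are hypothetical placeholders chosen for illustration; they are not figures reported in the paper.

```python
def extra_parameter_fraction(baseline_params: int, added_params: int) -> float:
    """Return the added parameters as a fraction of the baseline model size."""
    if baseline_params <= 0:
        raise ValueError("baseline_params must be positive")
    if added_params < 0:
        raise ValueError("added_params must be non-negative")
    return added_params / baseline_params


# Hypothetical counts for illustration only (NOT taken from the paper).
BASELINE_PARAMS = 50_000_000  # assumed size of a camera-based detector
PROMPTER_PARAMS = 900_000     # assumed size of the added LiDAR prompter

fraction = extra_parameter_fraction(BASELINE_PARAMS, PROMPTER_PARAMS)
within_budget = fraction < 0.02  # the "fewer than 2%" threshold from the findings

print(f"extra parameters: {fraction:.2%}, within 2% budget: {within_budget}")
```

With real models, the two counts would come from summing the sizes of the baseline's and the prompter's weight tensors.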
Abstract
Multi-camera 3D object detection aims to detect and localize objects in 3D space using multiple cameras, and has attracted growing attention because of its favorable cost-effectiveness trade-off. However, these methods often struggle with inaccurate depth estimation, because cameras are inherently weak at ranging. Multi-modal fusion and knowledge distillation methods for 3D object detection have recently been proposed to address this problem, but they are time-consuming to train and memory-intensive. In light of this, we propose PromptDet, a lightweight yet effective 3D object detection framework motivated by the success of prompt learning in 2D foundation models. PromptDet comprises two integral components: a general camera-based detection module, exemplified by models like BEVDet and BEVDepth, and a LiDAR-assisted prompter. The…
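The two-component design named in the abstract (a camera-based detection module plus a LiDAR-assisted prompter) and the flexible inference modes mentioned in the TL;DR can be illustrated with a toy sketch. The abstract is truncated and does not say how the prompter injects LiDAR information, so the additive fusion below is purely an assumption. All function names, the scalar prompt weight, and the toy feature transform are hypothetical and do not reflect the authors' implementation.

```python
from typing import List, Optional

Vector = List[float]


def camera_features(image_values: Vector) -> Vector:
    """Hypothetical stand-in for the camera-based detector's feature extractor."""
    # Toy transform; a real detector would produce bird's-eye-view features.
    return [0.5 * v for v in image_values]


def lidar_prompt(lidar_depths: Vector, weight: float = 0.25) -> Vector:
    """Hypothetical lightweight prompter with a single scalar parameter."""
    return [weight * d for d in lidar_depths]


def fused_features(image_values: Vector,
                   lidar_depths: Optional[Vector] = None) -> Vector:
    """Return camera features, plus the LiDAR prompt when one is supplied.

    Additive fusion is an illustrative assumption, not the paper's method.
    """
    feats = camera_features(image_values)
    if lidar_depths is None:
        # Camera-only mode: the prompter is skipped entirely.
        return feats
    if len(lidar_depths) != len(feats):
        raise ValueError("LiDAR prompt and camera features must align")
    prompt = lidar_prompt(lidar_depths)
    return [f + p for f, p in zip(feats, prompt)]


camera_only = fused_features([2.0, 4.0])
with_prompt = fused_features([2.0, 4.0], [10.0, 20.0])
print(camera_only, with_prompt)
```

Making the LiDAR input optional lets one code path serve both modes: passing `None` leaves the camera branch untouched, which mirrors the camera-only inference described in the findings.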
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Industrial Vision Systems and Defect Detection
Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training · Knowledge Distillation
