SparseDet: A Simple and Effective Framework for Fully Sparse LiDAR-based 3D Object Detection
Lin Liu, Ziying Song, Qiming Xia, Feiyang Jia, Caiyan Jia, Lei Yang, Hongyu Pan

TL;DR
SparseDet introduces a novel sparse query-based framework for LiDAR 3D object detection that effectively aggregates local and global contextual information, achieving state-of-the-art accuracy with high efficiency.
Contribution
The paper proposes SparseDet, which employs sparse queries and new aggregation modules to fully utilize contextual information while maintaining fast inference speeds.
Findings
Outperforms previous sparse detectors on nuScenes and KITTI datasets.
Achieves 2.2% higher mAP than previous sparse detectors on nuScenes while running at 13.5 FPS.
Achieves 1.12% higher AP_{3D} than previous sparse detectors on KITTI while running at 17.9 FPS.
Abstract
LiDAR-based sparse 3D object detection plays a crucial role in autonomous driving applications due to its computational efficiency advantages. Existing methods either use the features of a single central voxel as an object proxy, or treat an aggregated cluster of foreground points as an object proxy. However, the former lacks the ability to aggregate contextual information, resulting in insufficient information expression in object proxies. The latter relies on multi-stage pipelines and auxiliary tasks, which reduce the inference speed. To maintain the efficiency of the sparse framework while fully aggregating contextual information, in this work we propose SparseDet, which designs sparse queries as object proxies. It introduces two key modules, the Local Multi-scale Feature Aggregation (LMFA) module and the Global Feature Aggregation (GFA) module, aiming to fully capture the contextual…
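The contrast the abstract draws between a single central voxel feature and a context-aware object proxy can be illustrated with a toy sketch. This is not the paper's LMFA or GFA module, whose details are not given in this excerpt. The function names, the top-k query selection, the radii, and the mean pooling are all assumptions chosen only to show one way that local context at several scales might be folded into a sparse query.

```python
from math import dist

def select_queries(scores, k):
    """Return indices of the k highest-scoring voxels (hypothetical query selection)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def mean_pool(features, indices):
    """Average the feature vectors at the given indices, one value per channel."""
    dim = len(features[0])
    return [sum(features[i][d] for i in indices) / len(indices) for d in range(dim)]

def aggregate_local(coords, features, center, radii):
    """Concatenate mean-pooled neighbor features over several radii around one voxel.

    The center voxel is always its own neighbor (distance 0), so each pool is non-empty.
    """
    out = []
    for r in radii:
        neighbors = [i for i, c in enumerate(coords) if dist(c, coords[center]) <= r]
        out.extend(mean_pool(features, neighbors))
    return out

# Toy data: four occupied voxels with 2-channel features and a per-voxel score.
coords = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 0)]
features = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
scores = [0.9, 0.2, 0.1, 0.8]

# Keep the two highest-scoring voxels as sparse queries, then enrich each
# with context pooled at two assumed scales (radius 1.0 and radius 2.5).
queries = select_queries(scores, k=2)
proxies = {q: aggregate_local(coords, features, q, radii=(1.0, 2.5)) for q in queries}
```

In this toy setup the query at voxel 0 ends up with a feature that differs from its own raw feature because its neighbors contribute, whereas the isolated voxel 3 keeps its own values at both scales. A single-central-voxel proxy would correspond to skipping `aggregate_local` entirely.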
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
