3DSSD: Point-based 3D Single Stage Object Detector
Zetong Yang, Yanan Sun, Shu Liu, Jiaya Jia

TL;DR
This paper introduces 3DSSD, a lightweight point-based 3D object detector. It balances accuracy and efficiency by dropping the upsampling layers and refinement stage used in earlier point-based methods, and it relies on a fusion sampling strategy during downsampling so that detection remains feasible on the points that are kept.
Contribution
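The paper presents a single-stage, anchor-free, point-based 3D object detection framework built from three parts: a fusion sampling strategy for downsampling, a candidate generation layer, and an anchor-free regression head with a 3D center-ness assignment strategy.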
Findings
Outperforms voxel-based single-stage methods significantly.
Achieves comparable results to two-stage point-based methods.
Runs at over 25 FPS, doubling the speed of previous point-based detectors.
Abstract
Currently, there have been many kinds of voxel-based 3D single stage detectors, while point-based single stage methods are still underexplored. In this paper, we first present a lightweight and effective point-based 3D single stage object detector, named 3DSSD, achieving a good balance between accuracy and efficiency. In this paradigm, all upsampling layers and refinement stage, which are indispensable in all existing point-based methods, are abandoned to reduce the large computation cost. We novelly propose a fusion sampling strategy in downsampling process to make detection on less representative points feasible. A delicate box prediction network including a candidate generation layer, an anchor-free regression head with a 3D center-ness assignment strategy is designed to meet with our demand of accuracy and speed. Our paradigm is an elegant single stage anchor-free framework, showing…
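The abstract names a fusion sampling strategy without describing it here. As a rough illustration only, the sketch below assumes that fusion sampling means keeping half of the points by farthest point sampling on coordinates and half by farthest point sampling on features. The function names `farthest_point_sampling` and `fusion_sampling`, the even split, and the plain Euclidean distance are all assumptions made for this example, not the authors' implementation.

```python
import math


def farthest_point_sampling(vectors, k, start=0):
    """Greedy farthest point sampling over a list of equal-length tuples.

    Returns the indices of k items, each chosen as the item farthest from
    everything selected so far. `start` is the (assumed) seed index.
    """
    n = len(vectors)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(vectors)")
    chosen = [start]
    # Distance from every item to its nearest already-chosen item.
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(chosen) < k:
        nxt = max(range(n), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, v in enumerate(vectors):
            d = math.dist(v, vectors[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen


def fusion_sampling(xyz, features, k):
    """Hypothetical fusion step: half the budget by coordinates, half by features."""
    by_xyz = farthest_point_sampling(xyz, k // 2)
    by_feat = farthest_point_sampling(features, k - k // 2)
    return by_xyz, by_feat


# Point 1 sits next to point 0 in space but has a very different feature.
xyz = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (0, 10, 0)]
features = [(0.0,), (5.0,), (0.1,), (0.2,)]
spatial_idx, feature_idx = fusion_sampling(xyz, features, 4)
```

In this toy input, sampling on coordinates skips point 1 because it lies close to point 0, while sampling on features keeps it because its feature differs. That is the intuition behind combining the two criteria when no later upsampling stage can recover discarded points.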
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · 3DSSD
