3D Object Detection Combining Semantic and Geometric Features from Point Clouds
Hao Peng, Guofeng Tong, Zheng Li, Yaqi Wang, Yuyuan Shao

TL;DR
This paper introduces SGNet, a novel two-stage 3D object detector that combines voxel-based and point-based features, improving detection accuracy especially for small objects in point cloud scenes.
Contribution
The paper proposes a Voxel-Point-Based Module (VTPM) and a Confidence Adjustment Module (CAM) that enhance 3D detection without relying on preset anchors, achieving state-of-the-art results.
Findings
SGNet ranks 1st in cyclist detection on the KITTI dataset.
VTPM effectively captures semantic and geometric features.
CAM improves confidence-region alignment in detection.
Abstract
In this paper, we investigate the combination of voxel-based and point-based methods, and propose a novel end-to-end two-stage 3D object detector named SGNet for point cloud scenes. Voxel-based methods voxelize the scene into regular grids, which can be processed with current advanced convolutional feature learning frameworks for semantic feature learning, whereas point-based methods can better extract the geometric features of points because the point coordinates are preserved. The combination of the two is an effective solution for 3D object detection from point clouds. However, most current methods use a voxel-based detection head with anchors for final classification and localization. Although the preset anchors cover the entire scene, they are not suitable for point cloud detection tasks with larger scenes and multiple categories due to the limitation of…
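To make the voxel/point distinction in the abstract concrete, here is a minimal sketch of voxelization in plain Python. It is an illustration only, not SGNet's implementation: the function names `voxelize` and `voxel_centroids`, the 0.5 voxel size, and the sample points are all assumptions chosen for the example. Each voxel keeps its raw points, so the exact coordinates that point-based methods rely on remain available alongside the regular grid index.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Group (x, y, z) points into a regular grid of cubic voxels.

    Returns a dict mapping integer voxel indices (i, j, k) to the list of
    raw points that fall inside that voxel. (Hypothetical helper.)
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    grid = defaultdict(list)
    for x, y, z in points:
        # floor() keeps negative coordinates in the correct cell.
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        grid[key].append((x, y, z))
    return dict(grid)


def voxel_centroids(grid):
    """Average the points in each voxel: one coarse feature per grid cell."""
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in grid.items()
    }


# Illustrative input: four points, voxel edge length 0.5 (assumed units).
points = [(0.1, 0.2, 0.0), (0.3, 0.1, 0.1), (1.2, 0.4, 0.0), (-0.2, 0.0, 0.0)]
grid = voxelize(points, voxel_size=0.5)
centroids = voxel_centroids(grid)
```

In this toy input the first two points share voxel (0, 0, 0), so the grid has three occupied cells. A convolutional backbone would operate on per-voxel features such as the centroids, while the per-voxel point lists stand in for the geometric detail that a point-based branch can still use.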
Taxonomy
Topics: 3D Surveying and Cultural Heritage · Advanced Neural Network Applications · Robotics and Sensor-Based Localization
