Improved Pillar with Fine-grained Feature for 3D Object Detection
Jiahui Fu, Guanghui Ren, Yunpeng Chen, Si Liu

TL;DR
This paper improves 3D object detection from LiDAR data by adding fine-grained features to PointPillar. Two new modules, a height-aware sub-pillar and a sparsity-based tiny-pillar, raise accuracy while keeping the efficiency of a 2D grid-based detector.
Contribution
It proposes height-aware sub-pillar and sparsity-based tiny-pillar modules that improve feature representation in 3D detection, surpassing previous methods in accuracy.
Findings
Significantly outperforms the previous state of the art on the Waymo dataset
Achieves better accuracy without sacrificing speed
Introduces novel height encoding and sparse attention modules
Abstract
3D object detection from LiDAR point clouds plays an important role in the perception module of autonomous driving, which requires high speed, stability, and accuracy. However, existing point-based methods struggle to meet the speed requirements because they process too many raw points, and voxel-based methods cannot guarantee stable speed because they rely on 3D sparse convolution. In contrast, 2D grid-based methods such as PointPillar easily achieve stable and efficient inference with simple 2D convolution, but their accuracy is limited by a coarse-grained point cloud representation. We therefore propose an improved pillar with fine-grained features, based on PointPillar, that significantly improves detection accuracy. It consists of two modules, a height-aware sub-pillar and a sparsity-based tiny-pillar, which get fine-grained representation…
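To make the "pillar" and "sub-pillar" terms concrete, here is a minimal sketch of the general idea: points are grouped into cells of a 2D grid (pillars), and each pillar is then split into vertical slices. This is an illustration only, not the authors' implementation; the abstract above is truncated and does not specify the method. The function name `group_into_sub_pillars`, the 0.32 m pillar size, the height range, and the number of height bins are all assumptions chosen for the example.

```python
from collections import defaultdict
import math


def group_into_sub_pillars(points, pillar_size=0.32, z_min=-2.0, z_max=4.0,
                           num_height_bins=4):
    """Group (x, y, z) points into 2D pillars, then split each pillar by height.

    Hypothetical helper for illustration; all default values are assumptions.
    Returns {(ix, iy): [points_in_bin_0, ..., points_in_bin_{n-1}]}.
    """
    bin_height = (z_max - z_min) / num_height_bins
    pillars = defaultdict(lambda: [[] for _ in range(num_height_bins)])
    for x, y, z in points:
        if not (z_min <= z < z_max):
            continue  # drop points outside the assumed vertical range
        # Coarse 2D cell: this is all a plain pillar grid keeps track of.
        ix = math.floor(x / pillar_size)
        iy = math.floor(y / pillar_size)
        # Vertical slice inside the pillar: the finer-grained part.
        iz = min(int((z - z_min) / bin_height), num_height_bins - 1)
        pillars[(ix, iy)][iz].append((x, y, z))
    return dict(pillars)


points = [
    (0.10, 0.10, -1.5),  # low point
    (0.20, 0.05, 1.0),   # same pillar, higher up
    (0.50, 0.10, 0.0),   # neighbouring pillar along x
    (0.10, 0.10, 9.0),   # above z_max, dropped
]
sub_pillars = group_into_sub_pillars(points)
for cell, bins in sorted(sub_pillars.items()):
    print(cell, [len(b) for b in bins])
```

A plain pillar grid would merge the first two points into a single cell and lose their height difference. Keeping a per-slice grouping preserves it, which is the kind of vertical detail the height-aware module is described as targeting.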
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · 3D Surveying and Cultural Heritage
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
