Multi-scale Feature Fusion with Point Pyramid for 3D Object Detection
Weihao Lu, Dezong Zhao, Cristiano Premebida, Li Zhang, Wenjing Zhao, Daxin Tian

TL;DR
This paper introduces POP-RCNN, a multi-scale feature fusion framework for 3D object detection in point clouds, improving feature communication across scales without added complexity, and demonstrating strong results on KITTI and Waymo datasets.
Contribution
It proposes the Point Pyramid RCNN with a novel PPFE module for effective multi-scale feature fusion and a point density confidence module, enhancing 3D detection performance.
Findings
Achieves state-of-the-art results on KITTI and Waymo datasets.
Effective multi-scale feature fusion improves detection of distant objects.
Compatible with existing voxel-based and point-voxel-based frameworks.
Abstract
Effective point cloud processing is crucial to LiDAR-based autonomous driving systems. The capability to understand features at multiple scales is required for object detection of intelligent vehicles, where road users may appear in different sizes. Recent methods focus on the design of the feature aggregation operators, which collect features at different scales from the encoder backbone and assign them to the points of interest. While effort has gone into the aggregation modules, the importance of how to fuse these multi-scale features has been overlooked. This leads to insufficient feature communication across scales. To address this issue, this paper proposes the Point Pyramid RCNN (POP-RCNN), a feature pyramid-based framework for 3D object detection on point clouds. POP-RCNN consists of a Point Pyramid Feature Enhancement (PPFE) module to establish connections across spatial scales…
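The idea of letting features communicate across scales can be illustrated with a generic top-down pyramid pass over point features. The sketch below is not the paper's PPFE module, whose details are not given in this summary. It assumes each scale is a set of points with coordinates and equal-width feature vectors, and it assumes nearest-neighbor lookup plus addition as the fusion rule. The function names `nearest_neighbor_upsample` and `top_down_fuse` are hypothetical.

```python
import numpy as np

def nearest_neighbor_upsample(src_xyz, src_feat, dst_xyz):
    """Give each destination point the feature of its nearest source point."""
    # Pairwise squared distances, shape (N_dst, N_src).
    d2 = ((dst_xyz[:, None, :] - src_xyz[None, :, :]) ** 2).sum(axis=-1)
    return src_feat[d2.argmin(axis=1)]

def top_down_fuse(levels):
    """Fuse point features from coarse to fine (assumed fusion rule: addition).

    levels: list of (xyz, feat) pairs ordered fine to coarse, where xyz has
    shape (N, 3) and feat has shape (N, C) with the same C at every level.
    Returns a list of fused feature arrays in the same order.
    """
    fused = [None] * len(levels)
    fused[-1] = levels[-1][1]  # the coarsest level has nothing above it
    for i in range(len(levels) - 2, -1, -1):
        xyz, feat = levels[i]
        coarse_xyz = levels[i + 1][0]
        # Propagate the already-fused coarser features down to this level.
        fused[i] = feat + nearest_neighbor_upsample(coarse_xyz, fused[i + 1], xyz)
    return fused

# Illustrative random data: three scales with 64, 16, and 4 points.
rng = np.random.default_rng(0)
levels = [(rng.normal(size=(n, 3)), rng.normal(size=(n, 8))) for n in (64, 16, 4)]
fused = top_down_fuse(levels)
print([f.shape for f in fused])
```

After the pass, every fine-scale point carries information from all coarser scales, which is the kind of cross-scale connection a feature pyramid is meant to provide. A real detector would typically use learned layers and a more careful interpolation than this nearest-neighbor copy.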
Taxonomy
Topics: Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Advanced Image and Video Retrieval Techniques
Methods: Focus
