End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds
Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, Vijay Vasudevan

TL;DR
This paper introduces a novel multi-view fusion approach for 3D object detection in LiDAR point clouds, combining bird's-eye and perspective views through dynamic voxelization to improve detection accuracy, especially for small or distant objects.
Contribution
The paper proposes a new end-to-end multi-view fusion algorithm with dynamic voxelization that enhances feature learning from multiple perspectives in LiDAR data.
Findings
Significant accuracy improvements over single-view baselines.
Effective fusion of bird's-eye and perspective view features.
Robust detection performance on Waymo and KITTI datasets.
Abstract
Recent work on 3D object detection advocates point cloud voxelization in birds-eye view, where objects preserve their physical dimensions and are naturally separable. When represented in this view, however, point clouds are sparse and have highly variable point density, which may cause detectors difficulties in detecting distant or small objects (pedestrians, traffic signs, etc.). On the other hand, perspective view provides dense observations, which could allow more favorable feature encoding for such cases. In this paper, we aim to synergize the birds-eye view and the perspective view and propose a novel end-to-end multi-view fusion (MVF) algorithm, which can effectively learn to utilize the complementary information from both. Specifically, we introduce dynamic voxelization, which has four merits compared to existing voxelization methods, i) removing the need of pre-allocating a…
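The abstract is cut off before it finishes listing the merits of dynamic voxelization, so the following is only a minimal sketch of the general idea as described above: every in-range point is mapped to its voxel, with no fixed-size buffer allocated in advance. The function name `dynamic_voxelize`, its parameters, and the per-voxel mean pooling are assumptions made for illustration; they are not taken from the paper, whose feature encoder is not described in this summary.

```python
import numpy as np

def dynamic_voxelize(points, voxel_size, range_min, range_max):
    """Hypothetical helper: map every in-range point to a voxel.

    No per-voxel point cap and no voxel-count cap is used; the number of
    voxels is whatever the input happens to occupy.
    """
    points = np.asarray(points, dtype=np.float64)
    voxel_size = np.asarray(voxel_size, dtype=np.float64)
    range_min = np.asarray(range_min, dtype=np.float64)
    range_max = np.asarray(range_max, dtype=np.float64)

    # Drop points outside the detection range.
    in_range = np.all((points >= range_min) & (points < range_max), axis=1)
    kept = points[in_range]

    # Integer voxel coordinates for each remaining point.
    coords = np.floor((kept - range_min) / voxel_size).astype(np.int64)

    # Occupied voxels plus a point -> voxel index map.
    voxels, point_to_voxel = np.unique(coords, axis=0, return_inverse=True)
    point_to_voxel = point_to_voxel.reshape(-1)

    # Mean-pool point coordinates per voxel (an assumed, simple aggregation).
    counts = np.bincount(point_to_voxel, minlength=len(voxels))
    sums = np.zeros((len(voxels), kept.shape[1]))
    np.add.at(sums, point_to_voxel, kept)
    means = sums / counts[:, None]

    return voxels, point_to_voxel, means


pts = [[0.1, 0.1, 0.0], [0.2, 0.3, 0.0], [1.5, 0.2, 0.0], [9.0, 9.0, 0.0]]
voxels, point_to_voxel, means = dynamic_voxelize(
    pts, voxel_size=[1, 1, 1], range_min=[0, 0, -1], range_max=[4, 4, 1]
)
```

In this toy input the first two points share a voxel, the third occupies its own, and the fourth lies outside the range and is dropped. The `point_to_voxel` map is what lets voxel-level features be gathered back to individual points, which is the kind of point-to-voxel correspondence a fusion of several views would rely on.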
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Video Surveillance and Tracking Methods
