RoIFusion: 3D Object Detection from LiDAR and Vision
Can Chen, Luca Zanotti Fragonara, and Antonios Tsourdos

TL;DR
RoIFusion introduces a novel deep neural network architecture that effectively fuses LIDAR and camera data for 3D object detection, achieving state-of-the-art results on the KITTI benchmark.
Contribution
The paper proposes a new fusion algorithm that projects 3D RoIs to 2D image RoIs, improving 3D detection accuracy by leveraging complementary sensor information.
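The projection step described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera described by a 3x4 projection matrix `P`, a 3D RoI already expressed in camera coordinates (x right, y down, z forward, yaw about the y axis, as in KITTI-style conventions), and a box parameterised by its geometric centre. The function names `box3d_corners`, `project_points`, and `roi3d_to_roi2d` are hypothetical and chosen only for illustration.

```python
import math

def box3d_corners(center, size, yaw):
    """Return the 8 corners of a 3D box (assumed camera coordinates).

    center: (x, y, z) geometric centre; size: (length, height, width);
    yaw: rotation about the y axis in radians.
    """
    cx, cy, cz = center
    length, height, width = size
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-length / 2, length / 2):
        for dy in (-height / 2, height / 2):
            for dz in (-width / 2, width / 2):
                # Rotate the local offset about the y axis, then translate.
                x = c * dx + s * dz
                z = -s * dx + c * dz
                corners.append((cx + x, cy + dy, cz + z))
    return corners

def project_points(P, points):
    """Project 3D points to pixels with a 3x4 matrix P.

    Points at or behind the image plane (non-positive depth) are skipped.
    """
    pixels = []
    for x, y, z in points:
        hom = (x, y, z, 1.0)
        u, v, d = (sum(P[r][k] * hom[k] for k in range(4)) for r in range(3))
        if d > 1e-6:
            pixels.append((u / d, v / d))
    return pixels

def roi3d_to_roi2d(P, center, size, yaw, image_wh):
    """Axis-aligned 2D RoI (u_min, v_min, u_max, v_max) enclosing the
    projected corners, clipped to the image; None if nothing is visible."""
    pixels = project_points(P, box3d_corners(center, size, yaw))
    if not pixels:
        return None
    img_w, img_h = image_wh
    u_min = min(max(min(u for u, _ in pixels), 0.0), img_w)
    u_max = min(max(max(u for u, _ in pixels), 0.0), img_w)
    v_min = min(max(min(v for _, v in pixels), 0.0), img_h)
    v_max = min(max(max(v for _, v in pixels), 0.0), img_h)
    if u_max <= u_min or v_max <= v_min:
        return None
    return (u_min, v_min, u_max, v_max)

# Illustrative values only: focal length 100 px, principal point (50, 50).
P_example = [[100.0, 0.0, 50.0, 0.0],
             [0.0, 100.0, 50.0, 0.0],
             [0.0, 0.0, 1.0, 0.0]]
print(roi3d_to_roi2d(P_example, (0.0, 0.0, 10.0), (2.0, 2.0, 2.0), 0.0, (100, 100)))
```

The resulting 2D RoI could then be used to crop or pool image features for the corresponding 3D proposal. If the 3D RoIs live in LiDAR coordinates, a LiDAR-to-camera extrinsic transform would have to be applied before this projection.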
Findings
Achieves state-of-the-art performance on the KITTI benchmark.
Efficient fusion method improves 3D detection accuracy.
Utilizes projection-based fusion to handle sparse point clouds.
Abstract
When localizing and detecting 3D objects for autonomous driving scenes, obtaining information from multiple sensors (e.g. camera, LIDAR) typically increases the robustness of 3D detectors. However, the efficient and effective fusion of different features captured from LIDAR and camera is still challenging, especially due to the sparsity and irregularity of point cloud distributions. This notwithstanding, point clouds offer useful complementary information. In this paper, we would like to leverage the advantages of LIDAR and camera sensors by proposing a deep neural network architecture for the fusion and the efficient detection of 3D objects by identifying their corresponding 3D bounding boxes with orientation. In order to achieve this task, instead of densely combining the point-wise features of the point cloud and the related pixel features, we propose a novel fusion algorithm by…
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Optical Sensing Technologies
