SDVRF: Sparse-to-Dense Voxel Region Fusion for Multi-modal 3D Object Detection

Binglu Ren; Jianqin Yin

arXiv:2304.08304 · cs.CV · September 19, 2023 · 1 citation

Open Access

TL;DR

This paper introduces SDVRF, a novel multi-modal 3D object detection method that dynamically fuses sparse LiDAR point clouds with dense image features using voxel regions, improving detection especially for small objects.

Contribution

The paper proposes the concept of a dynamic voxel region and a multi-scale fusion framework that improve multi-modal 3D detection by aligning LiDAR and image features more accurately and fusing them more densely.

Findings

01. Improves detection accuracy on the KITTI dataset, especially for small objects.

02. Outperforms baseline methods in multi-modal 3D object detection.

03. Enhances feature alignment and fusion density through dynamic voxel regions.

Abstract

In the perception task of autonomous driving, multi-modal methods have become a trend due to the complementary characteristics of LiDAR point clouds and image data. However, the performance of multi-modal methods is usually limited by the sparsity of the point cloud or by noise caused by misalignment between the LiDAR and the camera. To solve these two problems, we present a new concept, Voxel Region (VR), which is obtained by dynamically projecting the sparse local point clouds in each voxel. We also propose a novel fusion method named Sparse-to-Dense Voxel Region Fusion (SDVRF). Specifically, more pixels of the image feature map inside the VR are gathered to supplement the voxel feature extracted from sparse points and achieve denser fusion. Meanwhile, unlike prior methods, which project fixed-size grids, our strategy of generating dynamic regions achieves better…
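
The Voxel Region idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the region's shape or the pooling operator, so the sketch assumes that a region is the axis-aligned bounding box of the projected points of one voxel and that image features inside it are mean-pooled. The function names (`project_points`, `voxel_regions`, `pool_region`), the 3x4 projection matrix `P`, and the `(H, W, C)` feature map layout are all hypothetical.

```python
import numpy as np


def project_points(points, P):
    """Project (N, 3) points to pixel coordinates with a 3x4 matrix P.

    Assumes every point lies in front of the camera (positive depth).
    """
    homo = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    uvw = homo @ P.T                                           # (N, 3)
    return uvw[:, :2] / uvw[:, 2:3]                            # (N, 2)


def voxel_regions(points, P, voxel_size, pc_min):
    """Map each occupied voxel to a pixel box (u_min, v_min, u_max, v_max).

    Assumption: the region is the bounding box of the voxel's projected
    points, so its size follows the local point distribution instead of
    a fixed grid.
    """
    voxel_idx = np.floor((points - pc_min) / voxel_size).astype(int)
    uv = project_points(points, P)
    regions = {}
    for key in np.unique(voxel_idx, axis=0):
        mask = np.all(voxel_idx == key, axis=1)
        u_min, v_min = np.floor(uv[mask].min(axis=0)).astype(int)
        u_max, v_max = np.ceil(uv[mask].max(axis=0)).astype(int)
        regions[tuple(int(k) for k in key)] = (
            int(u_min), int(v_min), int(u_max), int(v_max))
    return regions


def pool_region(feat_map, region):
    """Mean-pool an (H, W, C) feature map over a region (assumed operator)."""
    h, w, c = feat_map.shape
    u0, v0, u1, v1 = region
    u0, u1 = np.clip([u0, u1], 0, w - 1)  # clip the box to the image bounds
    v0, v1 = np.clip([v0, v1], 0, h - 1)
    return feat_map[v0:v1 + 1, u0:u1 + 1].reshape(-1, c).mean(axis=0)
```

In this sketch, calling `voxel_regions` on a point cloud yields one box per occupied voxel, and `pool_region` turns each box into a single image feature vector that could be concatenated with the corresponding voxel feature. A voxel holding a single point degenerates to a one-pixel region, which is the point-wise fusion case.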

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Video Surveillance and Tracking Methods