VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
Yin Zhou, Oncel Tuzel

TL;DR
VoxelNet is an end-to-end deep learning framework that processes raw 3D point clouds directly by dividing them into voxels. It removes the need for manual feature engineering and outperforms prior LiDAR-based 3D detection methods on the KITTI benchmark.
Contribution
The paper introduces VoxelNet, a novel unified network that combines feature extraction and detection in a single end-to-end trainable model for 3D point cloud data.
Findings
Outperforms state-of-the-art LiDAR-based 3D detection methods on the KITTI benchmark
Learns discriminative features for objects with varied geometries
Detects pedestrians and cyclists effectively using only LiDAR data
Abstract
Accurate detection of objects in 3D point clouds is a central problem in many applications, such as autonomous navigation, housekeeping robots, and augmented/virtual reality. To interface a highly sparse LiDAR point cloud with a region proposal network (RPN), most existing efforts have focused on hand-crafted feature representations, for example, a bird's eye view projection. In this work, we remove the need for manual feature engineering for 3D point clouds and propose VoxelNet, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single-stage, end-to-end trainable deep network. Specifically, VoxelNet divides a point cloud into equally spaced 3D voxels and transforms the group of points within each voxel into a unified feature representation through the newly introduced voxel feature encoding (VFE) layer. In this way, the point cloud is encoded…
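The voxel partitioning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `voxelize` and `centroid_feature`, the voxel size, and the grid origin are all hypothetical, and the centroid is only a hand-written stand-in for the learned per-voxel feature that the VFE layer would produce.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Group (x, y, z) points by the integer index of the voxel that contains them.

    Only non-empty voxels are stored, which suits sparse LiDAR data.
    voxel_size and origin are illustrative parameters, not values from the paper.
    """
    voxels = defaultdict(list)
    for x, y, z in points:
        index = (
            math.floor((x - origin[0]) / voxel_size[0]),
            math.floor((y - origin[1]) / voxel_size[1]),
            math.floor((z - origin[2]) / voxel_size[2]),
        )
        voxels[index].append((x, y, z))
    return dict(voxels)


def centroid_feature(voxel_points):
    """Stand-in per-voxel feature: the centroid of the points in the voxel.

    VoxelNet learns this mapping with its VFE layer; the centroid is used here
    only to show the "group of points -> one feature vector" shape of the step.
    """
    n = len(voxel_points)
    return tuple(sum(p[i] for p in voxel_points) / n for i in range(3))


# Four toy points on a grid of 0.5 m cubes (hypothetical values).
points = [(0.1, 0.2, 0.3), (0.3, 0.1, 0.2), (1.2, 0.1, 0.1), (-0.1, 0.0, 0.0)]
voxels = voxelize(points, voxel_size=(0.5, 0.5, 0.5))
features = {index: centroid_feature(pts) for index, pts in voxels.items()}
```

In this toy input the first two points share voxel (0, 0, 0), while the other two fall into voxels (2, 0, 0) and (-1, 0, 0), so `features` holds one vector per occupied voxel rather than one per point.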
Taxonomy
Topics: Advanced Neural Network Applications · Remote Sensing and LiDAR Applications · Robotics and Sensor-Based Localization
Methods: Region Proposal Network
