Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection
Zhixin Wang, Kui Jia

TL;DR
This paper introduces Frustum ConvNet, a novel end-to-end method for amodal 3D object detection from point clouds that leverages 2D proposals to generate frustums, aggregate local features, and accurately estimate 3D bounding boxes.
Contribution
The paper presents a new frustum-based convolutional network architecture for 3D detection that is dataset-agnostic and outperforms existing methods on SUN-RGBD and KITTI datasets.
Findings
Outperforms all existing methods on SUN-RGBD.
Achieves state-of-the-art results on the KITTI benchmark.
Demonstrates effectiveness of frustum-based feature aggregation.
Abstract
In this work, we propose a novel method termed Frustum ConvNet (F-ConvNet) for amodal 3D object detection from point clouds. Given 2D region proposals in an RGB image, our method first generates a sequence of frustums for each region proposal, and uses the obtained frustums to group local points. F-ConvNet aggregates point-wise features as frustum-level feature vectors, and arrays these feature vectors as a feature map for use of its subsequent component of fully convolutional network (FCN), which spatially fuses frustum-level features and supports an end-to-end and continuous estimation of oriented boxes in the 3D space. We also propose component variants of F-ConvNet, including an FCN variant that extracts multi-resolution frustum features, and a refined use of F-ConvNet over a reduced 3D space. Careful ablation studies verify the efficacy of these component variants. F-ConvNet…
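The grouping and aggregation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the points already lie inside the frustum of one 2D region proposal, that each point has a precomputed feature vector, and that frustums are overlapping slices along the depth axis pooled with an element-wise max. The function name `frustum_feature_map` and the parameters `d_min`, `d_max`, `num_frustums`, and `overlap` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def frustum_feature_map(
    depths: Sequence[float],
    features: Sequence[Sequence[float]],
    d_min: float,
    d_max: float,
    num_frustums: int,
    overlap: float = 2.0,
) -> List[List[float]]:
    """Max-pool per-point features inside sliding depth slices.

    Slice i covers [d_min + i * stride, d_min + i * stride + overlap * stride),
    so neighbouring slices overlap when overlap > 1 (an assumption of this
    sketch). Empty slices produce an all-zero vector.
    """
    if len(depths) != len(features):
        raise ValueError("depths and features must have the same length")
    if num_frustums <= 0 or d_max <= d_min:
        raise ValueError("need num_frustums > 0 and d_max > d_min")

    dim = len(features[0]) if features else 0
    stride = (d_max - d_min) / num_frustums
    height = overlap * stride

    feature_map: List[List[float]] = []
    for i in range(num_frustums):
        lo = d_min + i * stride
        hi = lo + height
        # Group the points whose depth falls inside this frustum slice.
        members = [f for d, f in zip(depths, features) if lo <= d < hi]
        if members:
            # Element-wise max over the grouped point features.
            pooled = [max(column) for column in zip(*members)]
        else:
            pooled = [0.0] * dim
        feature_map.append(pooled)
    return feature_map


# Three points with 2-D features, four slices of stride 1 and height 2.
example = frustum_feature_map(
    depths=[0.5, 1.5, 2.5],
    features=[[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]],
    d_min=0.0,
    d_max=4.0,
    num_frustums=4,
)
print(example)  # [[1.0, 2.0], [3.0, 2.0], [3.0, 1.0], [0.0, 0.0]]
```

The result is one feature vector per frustum, ordered by depth. In this sketch, that ordered array stands in for the feature map that the abstract says is passed to the fully convolutional network.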
Taxonomy
Topics: Advanced Neural Network Applications · 3D Surveying and Cultural Heritage · Robotics and Sensor-Based Localization
Methods: Max Pooling · Convolution · Fully Convolutional Network
