IAFA: Instance-aware Feature Aggregation for 3D Object Detection from a Single Image
Dingfu Zhou, Xibin Song, Yuchao Dai, Junbo Yin, Feixiang Lu, Jin Fang, Miao Liao, Liangjun Zhang

TL;DR
This paper introduces IAFA, an instance-aware feature aggregation module that enhances 3D object detection accuracy from single images by effectively combining local and global features, achieving state-of-the-art results on KITTI.
Contribution
The paper proposes a novel instance-aware feature aggregation module and demonstrates its effectiveness in improving 3D detection accuracy from single images.
Findings
Significant boost in 3D detection accuracy on KITTI benchmark.
Outperforms all existing single image-based 3D detection methods.
Effective use of coarse instance annotations for spatial attention learning.
Abstract
3D object detection from a single image is an important task in Autonomous Driving (AD), where various approaches have been proposed. However, the task is intrinsically ambiguous and challenging, as single-image depth estimation is already an ill-posed problem. In this paper, we propose an instance-aware approach to aggregate useful information for improving the accuracy of 3D object detection, with the following contributions. First, an instance-aware feature aggregation (IAFA) module is proposed to collect local and global features for 3D bounding box regression. Second, we empirically find that the spatial attention module can be well learned by taking coarse-level instance annotations as a supervision signal. The proposed module has significantly boosted the performance of the baseline method on both 3D detection and 2D bird's-eye view of vehicle detection among all three…
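The two ideas in the abstract (attention-weighted feature aggregation for one object, and supervising that attention with a coarse instance mask) can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the function names `aggregate_at` and `attention_bce`, the normalised weighted average, and the binary cross-entropy loss are all assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def aggregate_at(features, attn_logits):
    """Attention-weighted average of a feature map for one object.

    features:    H x W x C nested lists (hypothetical backbone output).
    attn_logits: H x W logits predicted for this object (hypothetical).
    Returns a length-C feature vector. Normalising by the weight sum
    is an assumption of this sketch, not taken from the paper.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    weight_sum = 0.0
    for feat_row, logit_row in zip(features, attn_logits):
        for feat, logit in zip(feat_row, logit_row):
            w = sigmoid(logit)
            weight_sum += w
            for c in range(channels):
                total[c] += w * feat[c]
    return [t / max(weight_sum, 1e-8) for t in total]


def attention_bce(attn_logits, coarse_mask):
    """Mean binary cross-entropy between the sigmoid attention map and
    a 0/1 coarse instance mask (assumed form of the supervision)."""
    eps = 1e-7
    losses = []
    for logit_row, mask_row in zip(attn_logits, coarse_mask):
        for logit, m in zip(logit_row, mask_row):
            p = min(max(sigmoid(logit), eps), 1.0 - eps)
            losses.append(-(m * math.log(p) + (1 - m) * math.log(1.0 - p)))
    return sum(losses) / len(losses)


# Toy 2 x 2 feature map with one channel; the object covers the top row.
features = [[[1.0], [3.0]], [[5.0], [7.0]]]
attn_logits = [[20.0, 20.0], [-20.0, -20.0]]
coarse_mask = [[1, 1], [0, 0]]

pooled = aggregate_at(features, attn_logits)
loss = attention_bce(attn_logits, coarse_mask)
```

In this toy case the attention is concentrated on the top row, so `pooled` is close to the mean of that row's features, and `loss` is small because the attention map agrees with the coarse mask.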
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
Methods: Convolution · Sigmoid Activation · Max Pooling · Average Pooling
