Scatter Points in Space: 3D Detection from Multi-view Monocular Images
Jianlin Liu, Zhuofei Huang, Dihe Huang, Shang Xu, Ying Chen, and Yong Liu

TL;DR
This paper introduces a novel learnable keypoints sampling method for multi-view 3D object detection from monocular images, improving data efficiency and detection accuracy by scattering pseudo surface points and employing geometric constraints.
Contribution
It proposes a learnable keypoints sampling technique and a surface filter module to enhance multi-view feature aggregation and noise suppression in 3D detection from monocular images.
Findings
Achieves more than 0.1 AP improvement on some ScanNet categories.
Outperforms previous methods in 3D detection accuracy.
Effectively maintains data sparsity with scattered pseudo surface points.
Abstract
3D object detection from monocular image(s) is a challenging and long-standing problem in computer vision. To combine information from different perspectives without troublesome 2D instance tracking, recent methods tend to aggregate multi-view features by densely sampling a regular 3D grid in space, which is inefficient. In this paper, we attempt to improve multi-view feature aggregation by proposing a learnable keypoints sampling method, which scatters pseudo surface points in 3D space in order to keep data sparsity. The scattered points, augmented by multi-view geometric constraints and visual features, are then employed to infer object locations and shapes in the scene. To make up for the limitations of a single frame and to model multi-view geometry explicitly, we further propose a surface filter module for noise suppression. Experimental results show that our method achieves significantly better…
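The idea of aggregating multi-view features at a sparse set of 3D points, rather than over a dense regular grid, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the learnable keypoints sampler and the surface filter module are not reproduced, and the points are simply taken as given. The function names (`project_points`, `aggregate_multiview`), the pinhole camera model, the nearest-pixel lookup, and the mean pooling over views are all assumptions made for illustration.

```python
import numpy as np

def project_points(points, K, R, t):
    """Project world-space points (N, 3) into one pinhole camera.

    K is the 3x3 intrinsic matrix; R (3x3) and t (3,) map world to camera.
    Returns pixel coordinates (N, 2) and camera-frame depths (N,).
    """
    cam = points @ R.T + t                               # world -> camera frame
    depth = cam[:, 2]
    safe = np.where(np.abs(depth) > 1e-8, depth, 1e-8)   # avoid division by zero
    uv_h = cam @ K.T                                     # homogeneous pixel coords
    uv = uv_h[:, :2] / safe[:, None]
    return uv, depth

def aggregate_multiview(points, feats, Ks, Rs, ts):
    """Average nearest-pixel features over the views that see each point.

    points: (N, 3); feats: (V, H, W, C); Ks, Rs: (V, 3, 3); ts: (V, 3).
    Returns (N, C) fused features and an (N,) count of views per point.
    Points seen by no view keep an all-zero feature.
    """
    V, H, W, C = feats.shape
    acc = np.zeros((len(points), C))
    count = np.zeros(len(points))
    for v in range(V):
        uv, depth = project_points(points, Ks[v], Rs[v], ts[v])
        col = np.rint(uv[:, 0]).astype(int)
        row = np.rint(uv[:, 1]).astype(int)
        # Keep only points in front of the camera and inside the image.
        valid = (depth > 0) & (col >= 0) & (col < W) & (row >= 0) & (row < H)
        acc[valid] += feats[v, row[valid], col[valid]]
        count[valid] += 1
    return acc / np.maximum(count, 1)[:, None], count
```

With N scattered points, the cost of this lookup grows with N times the number of views, whereas a dense grid pays for every voxel whether or not it lies near a surface. The per-point view count is returned because a downstream step could use it to down-weight points that few views observe; that use is a suggestion here, not something the source describes.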
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · 3D Surveying and Cultural Heritage
