ImGeoNet: Image-induced Geometry-aware Voxel Representation for Multi-view 3D Object Detection
Tao Tu, Shun-Po Chuang, Yu-Lun Liu, Cheng Sun, Ke Zhang, Donna Roy, Cheng-Hao Kuo, Min Sun

TL;DR
ImGeoNet introduces a geometry-aware voxel representation for multi-view 3D object detection that improves accuracy and data efficiency by leveraging multi-view images and geometry induction, outperforming existing methods on multiple datasets.
Contribution
The paper presents a novel image-induced geometry-aware voxel representation that enhances multi-view 3D detection accuracy and efficiency, utilizing only images during inference and leveraging pre-trained 2D features.
Findings
Outperforms the state-of-the-art multi-view image-based method, ImVoxelNet, on all three evaluated datasets (ARKitScenes, ScanNetV2, and ScanNet200).
Achieves comparable results with fewer views, demonstrating data efficiency.
Surpasses point cloud-based VoteNet in sparse, noisy, and small object scenarios.
Abstract
We propose ImGeoNet, a multi-view image-based 3D object detection framework that models a 3D space by an image-induced geometry-aware voxel representation. Unlike previous methods which aggregate 2D features into 3D voxels without considering geometry, ImGeoNet learns to induce geometry from multi-view images to alleviate the confusion arising from voxels of free space, and during the inference phase, only images from multiple views are required. Besides, a powerful pre-trained 2D feature extractor can be leveraged by our representation, leading to a more robust performance. To evaluate the effectiveness of ImGeoNet, we conduct quantitative and qualitative experiments on three indoor datasets, namely ARKitScenes, ScanNetV2, and ScanNet200. The results demonstrate that ImGeoNet outperforms the current state-of-the-art multi-view image-based method, ImVoxelNet, on all three datasets in…
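The idea described in the abstract, lifting 2D features into voxels and then using induced geometry to suppress free-space voxels, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the nearest-pixel sampling, the simple averaging across views, and the `surface_prob` input (standing in for whatever geometry the network would induce from the images) are all assumptions made for illustration.

```python
import numpy as np

def project_points(points, K, R, t):
    """Project (N, 3) world points into a pinhole camera.

    Returns pixel coordinates (N, 2) and camera-space depth (N,).
    """
    cam = points @ R.T + t                      # world -> camera coordinates
    depth = cam[:, 2]
    safe = np.where(depth > 1e-6, depth, 1.0)   # avoid dividing by zero or negative depth
    uvw = cam @ K.T                             # apply intrinsics
    pix = uvw[:, :2] / safe[:, None]
    return pix, depth

def aggregate_voxel_features(centers, feats, Ks, Rs, ts):
    """Average 2D features over the views in which each voxel center projects
    inside the image (geometry-agnostic aggregation).

    centers: (N, 3) voxel centers; feats: (V, H, W, C) per-view feature maps.
    Returns (N, C) voxel features and (N,) count of contributing views.
    """
    V, H, W, C = feats.shape
    total = np.zeros((len(centers), C))
    count = np.zeros(len(centers))
    for v in range(V):
        pix, depth = project_points(centers, Ks[v], Rs[v], ts[v])
        col = np.round(pix[:, 0]).astype(int)   # nearest-pixel sampling (assumption)
        row = np.round(pix[:, 1]).astype(int)
        valid = (depth > 1e-6) & (col >= 0) & (col < W) & (row >= 0) & (row < H)
        total[valid] += feats[v, row[valid], col[valid]]
        count[valid] += 1
    return total / np.maximum(count, 1)[:, None], count

def apply_geometry_weights(voxel_feats, surface_prob):
    """Down-weight voxels that are likely free space.

    surface_prob: (N,) hypothetical per-voxel probability of lying near a surface.
    """
    return voxel_feats * surface_prob[:, None]
```

Without the last step, every voxel along a camera ray receives the same image feature whether or not it lies on a surface, which is the free-space confusion the abstract refers to. In this sketch the weights are simply passed in; in a learned system they would be predicted from the images.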
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · 3D Surveying and Cultural Heritage
