SGM3D: Stereo Guided Monocular 3D Object Detection
Zheyuan Zhou, Liang Du, Xiaoqing Ye, Zhikang Zou, Xiao Tan, Li Zhang, Xiangyang Xue, Jianfeng Feng

TL;DR
SGM3D introduces a stereo-guided monocular 3D detection framework that leverages stereo features and domain adaptation techniques to improve accuracy without relying on additional depth sensors.
Contribution
The paper proposes a novel stereo-guided monocular 3D detection method with multi-granularity domain adaptation and IoU-based object alignment, enhancing monocular detection performance.
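To make the idea of stereo-guided feature adaptation concrete, the following is a minimal sketch of a generic feature-imitation loss, in which monocular features are pulled toward stereo features at two granularities: the whole feature map, and only the foreground (object) positions. This is an illustration under stated assumptions, not the paper's actual MG-DA formulation, which this summary does not spell out. The function name feature_mimic_loss, the (C, H, W) tensor layout, the fg_mask argument, and the choice of mean squared error are all hypothetical.

```python
import numpy as np

def feature_mimic_loss(mono_feat, stereo_feat, fg_mask=None):
    """Mean squared error between monocular and stereo feature maps.

    Hypothetical illustration only; not the loss defined in the paper.

    mono_feat, stereo_feat: arrays of shape (C, H, W).
    fg_mask: optional boolean array of shape (H, W). When given, only
    the masked spatial positions contribute (an object-level term).
    """
    diff = (mono_feat - stereo_feat) ** 2
    if fg_mask is None:
        # Coarse granularity: average over the entire feature map.
        return float(diff.mean())
    if not fg_mask.any():
        # No foreground positions, so there is nothing to align.
        return 0.0
    # Fine granularity: average over foreground positions only.
    return float(diff[:, fg_mask].mean())

# Toy example: the two feature maps differ at a single position.
mono = np.zeros((1, 2, 2))
stereo = np.zeros((1, 2, 2))
stereo[0, 0, 0] = 2.0
mask = np.zeros((2, 2), dtype=bool)
mask[0, 0] = True

global_loss = feature_mimic_loss(mono, stereo)
object_loss = feature_mimic_loss(mono, stereo, mask)
```

In this toy case the object-level term is larger than the global term, because the discrepancy is concentrated on the foreground position rather than averaged away over background. That is one plausible motivation for combining several granularities, although how SGM3D weights them is not stated here.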
Findings
Achieves state-of-the-art results on KITTI and Lyft datasets.
Effectively utilizes stereo features to guide monocular 3D detection.
Demonstrates robustness without extra depth sensors.
Abstract
Monocular 3D object detection aims to predict the location, dimensions, and orientation of objects in 3D space, along with their category, given only a monocular image. The task is challenging because it is ill-posed: the 2D image plane critically lacks depth information. While there exist approaches that leverage off-the-shelf depth estimation or rely on LiDAR sensors to mitigate this problem, the dependence on an additional depth model or expensive equipment severely limits their scalability to generic 3D perception. In this paper, we propose a stereo-guided monocular 3D object detection framework, dubbed SGM3D, which adapts the robust 3D features learned from stereo inputs to enhance the features used for monocular detection. We innovatively present a multi-granularity domain adaptation (MG-DA) mechanism to exploit the network's ability to generate stereo-mimicking features…
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning
