MonoEdge: Monocular 3D Object Detection Using Local Perspectives
Minghan Zhu, Lingting Ge, Panqu Wang, Huei Peng

TL;DR
MonoEdge adds a local perspective module to monocular 3D object detection. The module models each object's local shape distortion independently of the camera intrinsic parameters, and plugging it into existing detection frameworks improves their accuracy.
Contribution
The paper presents a new local perspective module that captures local shape distortions, enabling monocular 3D detection to incorporate local perspective effects independently of camera intrinsic parameters.
Findings
Outperforms strong baseline methods on multiple datasets.
Effectively models local shape distortions for better depth and yaw estimation.
Integrates seamlessly with existing monocular 3D detection frameworks.
Abstract
We propose a novel approach for monocular 3D object detection by leveraging local perspective effects of each object. While the global perspective effect shown as size and position variations has been exploited for monocular 3D detection extensively, the local perspective has long been overlooked. We design a local perspective module to regress a newly defined variable named keyedge-ratios as the parameterization of the local shape distortion to account for the local perspective, and derive the object depth and yaw angle from it. Theoretically, this module does not rely on the pixel-wise size or position in the image of the objects, and is therefore independent of the camera intrinsic parameters. By plugging this module into existing monocular 3D object detection frameworks, we incorporate the local perspective distortion with global perspective effect for monocular 3D reasoning, and we…
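The intrinsics-independence claim in the abstract can be illustrated with a minimal sketch under an ideal pinhole camera model. This is not the paper's formulation of keyedge-ratios: the function names, the choice of two equal-height vertical edges, and the assumption that the depth offset between the edges is already known are all illustrative. The point is only that a ratio of projected edge lengths cancels the focal length, and that a depth can be recovered from such a ratio.

```python
import math


def projected_edge_height(edge_height_m: float, depth_m: float, focal_px: float) -> float:
    """Pixel height of a vertical edge under an ideal pinhole camera (assumed model)."""
    return focal_px * edge_height_m / depth_m


def edge_ratio(depth_a_m: float, depth_b_m: float, edge_height_m: float, focal_px: float) -> float:
    """Ratio of the projected heights of two equal-height vertical edges.

    The focal length and the metric edge height cancel, leaving depth_b / depth_a.
    """
    h_a = projected_edge_height(edge_height_m, depth_a_m, focal_px)
    h_b = projected_edge_height(edge_height_m, depth_b_m, focal_px)
    return h_a / h_b


def depth_from_ratio(ratio: float, depth_offset_m: float) -> float:
    """Depth of edge a, given ratio = z_b / z_a and a known offset z_b - z_a.

    Assumption: the offset is supplied by the caller. For a real object it
    would depend on the object's dimensions and orientation.
    """
    if math.isclose(ratio, 1.0):
        raise ValueError("ratio of 1 means both edges are at the same depth")
    return depth_offset_m / (ratio - 1.0)


# Two hypothetical cameras with different focal lengths observe the same pair of edges.
ratio_cam1 = edge_ratio(depth_a_m=20.0, depth_b_m=22.0, edge_height_m=1.5, focal_px=700.0)
ratio_cam2 = edge_ratio(depth_a_m=20.0, depth_b_m=22.0, edge_height_m=1.5, focal_px=1400.0)
recovered_depth = depth_from_ratio(ratio_cam1, depth_offset_m=2.0)
```

Both cameras yield the same ratio even though the pixel heights themselves differ by a factor of two, which is the sense in which a ratio-based parameterization does not depend on pixel-wise size.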
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization
