MonoDGP: Monocular 3D Object Detection with Decoupled-Query and Geometry-Error Priors
Fanqi Pu, Yifan Wang, Jiru Deng, Wenming Yang

TL;DR
MonoDGP introduces a Transformer-based approach for monocular 3D object detection that leverages geometry errors and decoupled queries to improve accuracy without extra data, achieving state-of-the-art results on KITTI.
Contribution
The paper proposes MonoDGP, a monocular 3D detection method that uses geometry-error priors and decoupled queries to improve depth estimation and retain 2D priors without multi-depth branches.
Findings
Achieves state-of-the-art performance on the KITTI benchmark.
Effectively leverages geometry errors as an alternative to multi-depth prediction.
Uses a decoupled 2D decoder to improve visual feature utilization.
Abstract
Perspective projection has been extensively utilized in monocular 3D object detection methods. It introduces geometric priors from 2D bounding boxes and 3D object dimensions to reduce the uncertainty of depth estimation. However, due to depth errors originating from the object's visual surface, the height of the bounding box often fails to represent the actual projected central height, which undermines the effectiveness of geometric depth. Direct prediction for the projected height unavoidably results in a loss of 2D priors, while multi-depth prediction with complex branches does not fully leverage geometric depth. This paper presents a Transformer-based monocular 3D object detection method called MonoDGP, which adopts perspective-invariant geometry errors to modify the projection formula. We also try to systematically discuss and explain the mechanisms and efficacy behind geometry…
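The geometric depth that the abstract refers to can be illustrated with the standard pinhole relation z = f · H / h, where f is the focal length in pixels, H the object's 3D height in meters, and h its projected height in pixels. The sketch below is a minimal illustration and not the authors' implementation: the function names are hypothetical, and modelling the geometry error as a simple additive depth correction is an assumption made here, since the summary above does not give the exact modified formula.

```python
def geometric_depth(focal_px: float, height_3d_m: float, height_2d_px: float) -> float:
    """Pinhole-model depth from perspective projection: z = f * H / h."""
    if height_2d_px <= 0:
        raise ValueError("projected height must be positive")
    return focal_px * height_3d_m / height_2d_px


def corrected_depth(
    focal_px: float,
    height_3d_m: float,
    height_2d_px: float,
    geometry_error_m: float = 0.0,
) -> float:
    """Geometric depth plus a predicted error term.

    ASSUMPTION: the error is applied additively in depth space. This is an
    illustrative choice, not a formula confirmed by the summary above.
    """
    return geometric_depth(focal_px, height_3d_m, height_2d_px) + geometry_error_m


# Hypothetical values: 700 px focal length, a 1.5 m tall object, 50 px box height.
z_geo = geometric_depth(700.0, 1.5, 50.0)
z_fix = corrected_depth(700.0, 1.5, 50.0, geometry_error_m=0.5)
print(z_geo, z_fix)
```

Because h sits in the denominator, a small error in the box height shifts the depth noticeably for distant objects. That sensitivity is why the mismatch between the bounding-box height and the actual projected central height matters.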
Taxonomy
Topics: Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Image and Object Detection Techniques
