TL;DR
This paper introduces FADNet, a novel monocular 3D object detection network that uses sequential feature association and depth hint augmentation to improve accuracy without extra priors.
Contribution
FADNet divides its output modalities into groups according to estimation difficulty, associates the groups sequentially with a convolutional Gated Recurrent Unit, and adds a depth hint module trained with explicit supervision.
Findings
Competitive performance on KITTI benchmark
No reliance on depth priors or refinement modules
Maintains real-time processing speed
Abstract
Monocular 3D object detection, which aims to predict the geometric properties of on-road objects, is a promising research topic for the intelligent perception systems of autonomous driving. Most state-of-the-art methods follow a keypoint-based paradigm, where the keypoints of objects are predicted and used as the basis for regressing the other geometric properties. In this work, a unified network named FADNet is presented to address the task of monocular 3D object detection. In contrast to previous keypoint-based methods, we propose to divide the output modalities into different groups according to the estimation difficulty of object properties. The groups are treated differently and sequentially associated by a convolutional Gated Recurrent Unit. Another contribution of this work is the strategy of depth hint augmentation. To provide characterized depth patterns as…
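The sequential association described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a generic convolutional GRU cell written in NumPy, and the class name `ConvGRUCell`, the helper `associate_groups`, the channel sizes, the number of groups, and the easier-to-harder ordering are all assumptions made for illustration. The idea shown is only that each output group's feature map updates a shared hidden state, so later groups can draw on what earlier groups produced.

```python
import numpy as np


def conv2d_same(x, w):
    """Zero-padded 'same' convolution. x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Accumulate the contribution of kernel tap (i, j) over all channels.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class ConvGRUCell:
    """Generic convolutional GRU cell (hypothetical stand-in, not FADNet's code)."""

    def __init__(self, in_ch, hid_ch, k=3, seed=0):
        rng = np.random.default_rng(seed)
        shape = (hid_ch, in_ch + hid_ch, k, k)
        self.w_z = rng.normal(0.0, 0.1, shape)  # update gate weights
        self.w_r = rng.normal(0.0, 0.1, shape)  # reset gate weights
        self.w_h = rng.normal(0.0, 0.1, shape)  # candidate state weights

    def step(self, x, h):
        xh = np.concatenate([x, h], axis=0)
        z = sigmoid(conv2d_same(xh, self.w_z))
        r = sigmoid(conv2d_same(xh, self.w_r))
        h_tilde = np.tanh(conv2d_same(np.concatenate([x, r * h], axis=0), self.w_h))
        # Blend the previous state with the candidate state.
        return (1.0 - z) * h + z * h_tilde


def associate_groups(group_feats, cell, hid_ch):
    """Feed per-group feature maps through the cell in order (assumed easy to hard)."""
    h = np.zeros((hid_ch,) + group_feats[0].shape[1:])
    states = []
    for feat in group_feats:
        h = cell.step(feat, h)
        states.append(h)  # a prediction head for this group would read from h
    return states
```

For example, three hypothetical groups with 4-channel feature maps of size 6×10 can be chained with `associate_groups(feats, ConvGRUCell(in_ch=4, hid_ch=8), hid_ch=8)`, which returns one hidden state per group; in a real detector each state would feed that group's regression head.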
