Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning
Yang Jiao, Zequn Jie, Shaoxiang Chen, Lechao Cheng, Jingjing Chen, Lin Ma, Yu-Gang Jiang

TL;DR
This paper introduces IA-BEV, a novel approach that enhances multi-camera 3D object detection by integrating instance awareness and structural priors into depth estimation, leading to improved BEV perception for autonomous driving.
Contribution
It presents a category-specific structural priors mining method and a self-boosting learning strategy to improve depth estimation accuracy in BEV-based 3D detection.
Findings
Achieves state-of-the-art results on the nuScenes benchmark.
Enhances depth estimation quality for better 3D detection.
Demonstrates effectiveness of structural priors and self-boosting strategies.
Abstract
The camera-based bird's-eye-view (BEV) perception paradigm has made significant progress in the autonomous driving field. Under this paradigm, constructing an accurate BEV representation relies on reliable depth estimation for multi-camera images. However, existing approaches exhaustively predict depths for every pixel without prioritizing objects, which are precisely the entities requiring detection in 3D space. To this end, we propose IA-BEV, which integrates image-plane instance awareness into the depth estimation process within a BEV-based detector. First, a category-specific structural priors mining approach is proposed to enhance the efficacy of monocular depth generation. In addition, a self-boosting learning strategy is proposed to encourage the model to place more emphasis on challenging objects in computation-expensive temporal stereo matching. Together they provide…
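To make concrete why BEV construction depends on depth quality, the following is a minimal NumPy sketch of a generic depth-weighted "lift" step of the kind commonly used in BEV detectors. It is not taken from the IA-BEV paper: the function names (`softmax`, `lift_features`), the tensor shapes, and the random inputs are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def lift_features(feats, depth_logits):
    """Spread per-pixel image features over discrete depth bins.

    feats:        (C, H, W) image features (hypothetical shape)
    depth_logits: (D, H, W) unnormalized scores over D depth bins
    returns:      (C, D, H, W) frustum features
    """
    depth_prob = softmax(depth_logits, axis=0)  # (D, H, W), sums to 1 over D
    # Outer product per pixel: each feature vector is weighted by the
    # probability assigned to each depth bin.
    return feats[:, None, :, :] * depth_prob[None, :, :, :]

# Toy inputs (assumed sizes, random values for illustration only).
rng = np.random.default_rng(0)
C, D, H, W = 4, 8, 2, 3
feats = rng.normal(size=(C, H, W))
depth_logits = rng.normal(size=(D, H, W))

frustum = lift_features(feats, depth_logits)
```

Because each pixel's feature is distributed according to its predicted depth distribution, an inaccurate distribution places object features at the wrong range in the BEV grid. This is the dependency the abstract points to when it argues for prioritizing depth quality on object instances.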
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Image Processing Techniques and Applications
