Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection
Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao

TL;DR
Graph-DETR3D improves multi-view 3D object detection by addressing border region challenges through graph structure learning and a depth-invariant multi-scale training strategy, achieving state-of-the-art results on nuScenes.
Contribution
The paper introduces a graph-based feature aggregation method and a depth-invariant multi-scale training strategy to improve detection of objects at image borders.
Findings
Achieves 49.5 NDS on the nuScenes test leaderboard.
Effectively enhances border region detection performance.
Outperforms existing image-view 3D detectors.
Abstract
3D object detection from multiple image views is a fundamental and challenging task for visual scene understanding. Due to its low cost and high efficiency, multi-view 3D object detection has demonstrated promising application prospects. However, accurately detecting objects through perspective views in the 3D space is extremely difficult due to the lack of depth information. Recently, DETR3D introduces a novel 3D-2D query paradigm in aggregating multi-view images for 3D object detection and achieves state-of-the-art performance. In this paper, with intensive pilot experiments, we quantify the objects located at different regions and find that the "truncated instances" (i.e., at the border regions of each image) are the main bottleneck hindering the performance of DETR3D. Although it merges multiple features from two adjacent views in the overlapping regions, DETR3D still suffers from…
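To make the notion of a "truncated instance" concrete, here is a minimal sketch of how a 3D point might be projected into one camera view and tested for membership in that image's border region. It assumes a simple pinhole camera model and is not the paper's implementation. The function names, the intrinsics, and the `border_frac` threshold are all hypothetical choices made for illustration; the abstract does not state how wide the border region is.

```python
from typing import Optional, Tuple


def project_point(
    point_cam: Tuple[float, float, float],
    fx: float, fy: float, cx: float, cy: float,
) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a 3D point (camera coordinates) to pixel coordinates.

    Returns None when the point lies on or behind the image plane (z <= 0).
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def is_in_border_region(
    uv: Tuple[float, float],
    width: int, height: int,
    border_frac: float = 0.1,  # hypothetical threshold, not taken from the paper
) -> bool:
    """True if the pixel is inside the image and within border_frac * width
    of the left or right edge, where adjacent camera views would overlap."""
    u, v = uv
    if not (0 <= u < width and 0 <= v < height):
        return False
    margin = border_frac * width
    return u < margin or u >= width - margin


# Illustrative intrinsics and image size (assumed values, not from the source).
FX = FY = 1000.0
CX, CY = 800.0, 450.0
W, H = 1600, 900

center = project_point((0.0, 0.0, 10.0), FX, FY, CX, CY)     # lands at image center
near_edge = project_point((7.5, 0.0, 10.0), FX, FY, CX, CY)  # lands near right edge
behind = project_point((0.0, 0.0, -5.0), FX, FY, CX, CY)     # behind the camera
```

In a multi-view setup, running this check for each camera would show which views see a given 3D reference point and whether it falls in a region where the object is likely truncated.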
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection
