ConQueR: Query Contrast Voxel-DETR for 3D Object Detection
Benjin Zhu, Zhe Wang, Shaoshuai Shi, Hang Xu, Lanqing Hong, Hongsheng Li

TL;DR
ConQueR introduces a contrastive query mechanism in a sparse 3D detector to significantly reduce false positives and improve detection accuracy, achieving state-of-the-art results on the Waymo dataset.
Contribution
The paper proposes a novel Query Contrast mechanism for sparse 3D detection, explicitly discriminating queries to enhance accuracy and reduce false positives.
Findings
Reduces false positives by up to 60%.
Achieves 71.6 mAPH/L2 on the Waymo dataset, surpassing previous methods.
Outperforms PV-RCNN++ by over 2.0 mAPH/L2.
Abstract
Although DETR-based 3D detectors can simplify the detection pipeline and achieve direct sparse predictions, their performance still lags behind dense detectors with post-processing for 3D object detection from point clouds. DETRs usually adopt a larger number of queries than GTs (e.g., 300 queries vs. 40 objects in Waymo) in a scene, which inevitably incurs many false positives during inference. In this paper, we propose a simple yet effective sparse 3D detector, named Query Contrast Voxel-DETR (ConQueR), to eliminate the challenging false positives and achieve more accurate and sparser predictions. We observe that most false positives are highly overlapping in local regions, caused by the lack of explicit supervision to discriminate locally similar queries. We thus propose a Query Contrast mechanism to explicitly enhance queries towards their best-matched GTs over all unmatched query…
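The idea of pulling each query towards its best-matched GT and away from all other queries can be illustrated with a generic InfoNCE-style contrastive loss. The sketch below is not the paper's exact formulation: the function name `query_contrast_loss`, the use of cosine similarity, the temperature value, and the assumption that GTs are already embedded in the same space as the queries are all hypothetical choices made for illustration. The 300 queries and 40 objects mirror the example figures in the abstract.

```python
import numpy as np

def query_contrast_loss(query_emb, gt_emb, matched_idx, temperature=0.1):
    """Illustrative InfoNCE-style loss (hypothetical, not the paper's code).

    query_emb:   (num_queries, dim) query embeddings
    gt_emb:      (num_gts, dim) ground-truth embeddings (assumed to share the query space)
    matched_idx: (num_gts,) index of the query matched to each GT
    temperature: assumed softmax temperature
    """
    # L2-normalize so the dot product is a cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gt_emb / np.linalg.norm(gt_emb, axis=1, keepdims=True)
    logits = g @ q.T / temperature                        # (num_gts, num_queries)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive: the matched query. Negatives: every other query in the scene.
    return -log_prob[np.arange(len(matched_idx)), matched_idx].mean()

# Toy data: 300 queries, 40 GTs, each GT lying close to its matched query.
rng = np.random.default_rng(0)
queries = rng.normal(size=(300, 16))
matched = rng.choice(300, size=40, replace=False)
gts = queries[matched] + 0.01 * rng.normal(size=(40, 16))

loss_matched = query_contrast_loss(queries, gts, matched)
loss_shuffled = query_contrast_loss(queries, gts, np.roll(matched, 1))
print(loss_matched, loss_shuffled)
```

In this toy setup the loss with the correct matching should come out lower than the loss with a shuffled matching, which is the behaviour a contrastive objective of this kind rewards: the matched query must score higher than every unmatched one, including locally similar neighbours.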
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Medical Imaging and Analysis
Methods: Goal-Driven Tree-Structured Neural Model
