CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection

Dhanalaxmi Gaddam; Jean Lahoud; Fahad Shahbaz Khan; Rao Muhammad Anwer; Hisham Cholakkal

arXiv:2209.06641·cs.CV·September 15, 2022

TL;DR

CMR3D is a multi-stage refinement framework that explicitly uses scene context at multiple levels to improve 3D object detection and counting, with a 2.0% improvement over the baseline on ScanNetV2.

Contribution

The paper presents CMR3D, a framework that combines multi-level scene context with multi-stage refinement for 3D object detection.

Findings

1. Achieves a 2.0% absolute improvement over the baseline on ScanNetV2
2. Improves 3D object counting accuracy
3. Demonstrates the benefit of contextual information in 3D detection

Abstract

Existing deep learning-based 3D object detectors typically rely on the appearance of individual objects and do not explicitly pay attention to the rich contextual information of the scene. In this work, we propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework, which takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene at multiple levels to predict a set of object bounding boxes along with their corresponding semantic labels. To this end, we propose to utilize a context enhancement network that captures the contextual information at different levels of granularity, followed by a multi-stage refinement module that progressively refines the box positions and class predictions. Extensive experiments on the large-scale ScanNetV2 benchmark reveal the benefits of our proposed method, leading to an absolute…
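
The idea of progressively refining box positions over several stages can be illustrated with a small sketch. This is not the authors' implementation: `Box3D`, `refine`, and `make_toy_stage` are hypothetical names, the box is axis-aligned, and the toy stage is a hand-written stand-in for a learned regressor that simply moves the box a fixed fraction toward a known target and ignores the context vector. It only shows the control flow, in which each stage predicts residual offsets that are added to the previous stage's box.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned 3D box (hypothetical type, for illustration only)."""
    center: Vec3
    size: Vec3


# A stage maps (current box, context features) to residual offsets
# (d_center, d_size). In a real detector this would be a learned network.
Stage = Callable[[Box3D, Sequence[float]], Tuple[Vec3, Vec3]]


def refine(box: Box3D, context: Sequence[float], stages: Sequence[Stage]) -> List[Box3D]:
    """Apply each stage in turn and return the box after every stage."""
    history = [box]
    for stage in stages:
        d_center, d_size = stage(box, context)
        box = Box3D(
            center=tuple(c + d for c, d in zip(box.center, d_center)),
            # Clamp so a residual can never produce a non-positive size.
            size=tuple(max(s + d, 1e-6) for s, d in zip(box.size, d_size)),
        )
        history.append(box)
    return history


def make_toy_stage(target: Box3D, step: float) -> Stage:
    """Toy stand-in for a learned stage: move a fraction `step` toward `target`.

    The context argument is accepted but ignored here (assumption for brevity).
    """
    def stage(box: Box3D, context: Sequence[float]) -> Tuple[Vec3, Vec3]:
        d_center = tuple(step * (t - c) for t, c in zip(target.center, box.center))
        d_size = tuple(step * (t - s) for t, s in zip(target.size, box.size))
        return d_center, d_size
    return stage
```

Calling `refine(initial_box, context, [stage_1, stage_2, stage_3])` returns the initial box followed by one refined box per stage, so the error against a reference box can be inspected stage by stage.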

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Robotics and Sensor-Based Localization