Rethinking Dimensionality Reduction in Grid-based 3D Object Detection

Dihe Huang; Ying Chen; Yikang Ding; Jinli Liao; Jianlin Liu; Kai Wu; Qiang Nie; Yong Liu; Chengjie Wang; Zhiheng Li

arXiv:2209.09464 · cs.CV · January 30, 2023 · 6 cites

Open Access

TL;DR

This paper introduces MDRNet, a novel 3D object detection network that reduces information loss during BEV feature extraction by dynamically focusing on valuable spatial features and fusing multi-level spatial information.

Contribution

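The paper proposes a multi-level feature dimensionality reduction strategy built from two modules, Spatial-aware Dimensionality Reduction (SDR) and Multi-level Spatial Residuals (MSR), which aims to improve 3D detection accuracy by preserving more spatial information during the voxel-to-BEV feature transformation.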

Findings

01. Outperforms state-of-the-art methods on the nuScenes dataset

02. Preserves 3D spatial information effectively during BEV feature extraction

03. Demonstrates significant accuracy improvements over existing approaches

Abstract

Bird's eye view (BEV) is widely adopted by current point cloud detectors because it allows well-explored 2D detection techniques to be applied. However, existing methods obtain BEV features by simply collapsing voxel or point features along the height dimension, which causes a heavy loss of 3D spatial information. To alleviate this loss, we propose MDRNet, a novel point cloud detection network based on a multi-level feature dimensionality reduction strategy. In MDRNet, Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of the object during the voxel-to-BEV feature transformation. Furthermore, Multi-level Spatial Residuals (MSR) is proposed to fuse the multi-level spatial information in the BEV feature maps. Extensive experiments on nuScenes show that the proposed method outperforms the state-of-the-art…
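
To make the information-loss argument concrete, the sketch below contrasts two ways of reducing the height dimension of a single BEV cell. It is not the authors' SDR module: the function names, the toy feature values, and the use of a channel-wise max as the "simple collapse" are all assumptions made for illustration, and the softmax-weighted variant is only a hypothetical stand-in for a reduction that can emphasize some height bins over others.

```python
import math

def collapse_height_max(column):
    """Naive reduction (assumed form): channel-wise max over the height bins."""
    return [max(channel) for channel in zip(*column)]

def collapse_height_weighted(column, scores):
    """Softmax-weighted sum over height bins.

    `scores` holds one value per height bin; in a real network these would be
    predicted from the features. Here they are supplied by hand (assumption).
    """
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * v for w, v in zip(weights, channel)) for channel in zip(*column)]

# One BEV cell with 3 height bins and 2 feature channels (made-up values).
column = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]

max_feat = collapse_height_max(column)
uniform_feat = collapse_height_weighted(column, [0.0, 0.0, 0.0])  # equal weights
top_bin_feat = collapse_height_weighted(column, [0.0, 0.0, 50.0])  # favors bin 2
```

With equal scores the weighted reduction is a plain mean over height, and with one dominant score it approaches the features of that single bin. The max variant, by contrast, mixes values from different bins into one vector, so the output no longer records which height each value came from.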

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection