Boosting 3D Object Detection with Semantic-Aware Multi-Branch Framework

Hao Jing; Anhong Wang; Lijun Zhao; Yakun Yang; Donghan Bu; Jing Zhang; Yifan Zhang; Junhui Hou

arXiv:2407.05769 · cs.CV · June 11, 2025

Open Access

TL;DR

This paper introduces a semantic-aware multi-branch framework for 3D object detection in autonomous driving, improving detection accuracy by incorporating semantic features and multi-view consistency in LiDAR point cloud processing.

Contribution

The paper proposes a novel multi-branch two-stage detection framework built on a Semantic-aware Multi-branch Sampling (SMS) module and multi-view consistency constraints, which improves detection especially for low-performance backbones.

Findings

01. Significant performance improvements on the KITTI and Waymo datasets.

02. Enhanced detection of distant objects and a stronger focus on non-ground points.

03. Effective across various backbone network structures.

Abstract

In autonomous driving, LiDAR sensors are vital for acquiring 3D point clouds, providing reliable geometric information. However, traditional sampling methods used in preprocessing often ignore semantic features, leading to detail loss and ground point interference in 3D object detection. To address this, we propose a multi-branch two-stage 3D object detection framework using a Semantic-aware Multi-branch Sampling (SMS) module and multi-view consistency constraints. The SMS module includes random sampling, Density Equalization Sampling (DES) for enhancing distant objects, and Ground Abandonment Sampling (GAS) to focus on non-ground points. The sampled multi-view points are processed through a Consistent KeyPoint Selection (CKPS) module to generate consistent keypoint masks for efficient proposal sampling. The first-stage detector uses multi-branch parallel learning with multi-view consistency…
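The three sampling branches named in the abstract can be illustrated with a rough sketch. The abstract does not give the actual DES or GAS algorithms, so everything below is an assumption made for illustration: DES is approximated by round-robin draws from range bins (hypothetical `bin_width`), and GAS by discarding points near a flat ground plane (hypothetical `ground_z` and `margin`). All function names are hypothetical and are not taken from the paper.

```python
import math
import random
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in the sensor frame, z pointing up


def random_sampling(points: List[Point], n: int, rng: random.Random) -> List[Point]:
    """Uniformly pick n points (or all of them if fewer exist)."""
    return rng.sample(points, min(n, len(points)))


def density_equalization_sampling(points: List[Point], n: int, rng: random.Random,
                                  bin_width: float = 10.0) -> List[Point]:
    """Assumed stand-in for DES: group points by horizontal range and draw
    from the bins in round-robin order, so sparse distant bins contribute
    as many points as dense nearby ones until they run out."""
    bins: Dict[int, List[Point]] = {}
    for p in points:
        horizontal_range = math.hypot(p[0], p[1])
        bins.setdefault(int(horizontal_range // bin_width), []).append(p)
    for bucket in bins.values():
        rng.shuffle(bucket)

    target = min(n, len(points))
    sampled: List[Point] = []
    while len(sampled) < target:
        for key in sorted(bins):
            if bins[key] and len(sampled) < target:
                sampled.append(bins[key].pop())
    return sampled


def ground_abandonment_sampling(points: List[Point], n: int, rng: random.Random,
                                ground_z: float = -1.5,
                                margin: float = 0.2) -> List[Point]:
    """Assumed stand-in for GAS: treat the ground as a flat plane at
    ground_z, drop points within margin of it, then sample the rest."""
    non_ground = [p for p in points if p[2] > ground_z + margin]
    return rng.sample(non_ground, min(n, len(non_ground)))


def multi_branch_sample(points: List[Point], n: int, seed: int = 0) -> Dict[str, List[Point]]:
    """Produce one sampled view of the same point cloud per branch."""
    rng = random.Random(seed)
    return {
        "random": random_sampling(points, n, rng),
        "des": density_equalization_sampling(points, n, rng),
        "gas": ground_abandonment_sampling(points, n, rng),
    }
```

Calling `multi_branch_sample(points, n)` returns three views of one scan. In the framework described above, such views would feed the parallel first-stage branches; the consistency constraints and the CKPS module are not sketched here because the abstract does not describe them in enough detail.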

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Image Processing and 3D Reconstruction

Methods: Focus