LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion

Xin Li; Tao Ma; Yuenan Hou; Botian Shi; Yuchen Yang; Youquan Liu; Xingjiao Wu; Qin Chen; Yikang Li; Yu Qiao; Liang He

arXiv:2303.03595 · cs.CV · March 15, 2023 · 5 citations

TL;DR

LoGoNet introduces a novel local-to-global fusion approach for LiDAR-camera data, significantly improving 3D object detection accuracy by combining fine-grained region-level and scene-level features.

Contribution

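The paper proposes LoGoNet, a LiDAR-camera fusion network that fuses features at both the global (scene) and local (proposal) levels. Global fusion uses point centroids to represent the position of voxel features more precisely, and local fusion samples image features around the projected grid centers of each proposal.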
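
As a rough illustration of why a point centroid can locate a voxel feature more precisely than the voxel's geometric center, the following NumPy sketch averages the points that fall into each voxel. The function name `voxel_centroids`, its parameters, and the voxelization scheme are assumptions made for illustration; this is not the LoGoNet implementation.

```python
import numpy as np

def voxel_centroids(points, voxel_size, origin):
    """Group points into voxels and return each occupied voxel's point centroid.

    points:     (N, 3) array of x, y, z coordinates (hypothetical input)
    voxel_size: scalar or (3,) voxel edge length(s)
    origin:     (3,) minimum corner of the voxel grid
    Returns (voxel_indices (M, 3), centroids (M, 3)).
    """
    points = np.asarray(points, dtype=np.float64)
    # Integer voxel index of every point
    idx = np.floor((points - np.asarray(origin)) / np.asarray(voxel_size)).astype(np.int64)
    # Unique occupied voxels and, for each point, which voxel it belongs to
    uniq, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    # Sum the coordinates per voxel, then divide by the point count
    sums = np.zeros((len(uniq), 3))
    np.add.at(sums, inverse, points)
    counts = np.bincount(inverse, minlength=len(uniq))
    return uniq, sums / counts[:, None]
```

With a coarse voxel grid, the centroid follows where the points actually lie inside the voxel, whereas the geometric center is fixed regardless of the point distribution.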

Findings

01 Achieves state-of-the-art results on the Waymo and KITTI datasets.

02 Ranks 1st on the Waymo 3D detection leaderboard.

03 Surpasses 80 APH (L2) on three classes simultaneously.

Abstract

LiDAR-camera fusion methods have shown impressive performance in 3D object detection. Recent advanced multi-modal methods mainly perform global fusion, where image features and point cloud features are fused across the whole scene. Such practice lacks fine-grained region-level information, yielding suboptimal fusion performance. In this paper, we present the novel Local-to-Global fusion network (LoGoNet), which performs LiDAR-camera fusion at both local and global levels. Concretely, the Global Fusion (GoF) of LoGoNet is built upon previous literature, while we exclusively use point centroids to more precisely represent the position of voxel features, thus achieving better cross-modal alignment. As to the Local Fusion (LoF), we first divide each proposal into uniform grids and then project these grid centers to the images. The image features around the projected grid points are sampled…
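
The first two Local Fusion steps described in the abstract (dividing a proposal into uniform grids and projecting the grid centers to the image) can be sketched as follows. This is a minimal sketch under stated assumptions: the box parameterization (center, size, yaw about the z-axis), the pinhole camera model, and the names `proposal_grid_centers`, `project_to_image`, `lidar_to_cam`, and `intrinsics` are all hypothetical and not taken from the paper or its repository. The subsequent image-feature sampling is not shown.

```python
import numpy as np

def proposal_grid_centers(center, size, yaw, grid=3):
    """Split a 3D box into grid**3 uniform cells and return the cell centers.

    center: (3,) box center; size: (3,) box extents; yaw: rotation about z (radians).
    """
    # Cell centers along one axis, as fractions of the box extent in (-0.5, 0.5)
    ticks = (np.arange(grid) + 0.5) / grid - 0.5
    local = np.stack(np.meshgrid(ticks, ticks, ticks, indexing="ij"), axis=-1)
    local = local.reshape(-1, 3) * np.asarray(size, dtype=np.float64)
    # Rotate from the box frame to the world frame, then translate
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return local @ rot.T + np.asarray(center, dtype=np.float64)

def project_to_image(points, lidar_to_cam, intrinsics):
    """Project (N, 3) points to pixel coordinates with a pinhole model.

    lidar_to_cam: (4, 4) rigid transform; intrinsics: (3, 3) camera matrix.
    Returns (uv (N, 2), in_front (N,) boolean mask).
    """
    homo = np.hstack([points, np.ones((len(points), 1))])
    cam = (lidar_to_cam @ homo.T).T[:, :3]
    in_front = cam[:, 2] > 1e-6            # keep only points ahead of the camera
    uvw = (intrinsics @ cam.T).T
    depth = np.where(in_front, uvw[:, 2], 1.0)  # avoid dividing by zero or negatives
    return uvw[:, :2] / depth[:, None], in_front
```

Pixel coordinates returned for points with `in_front == False` are not meaningful and should be masked out before any feature lookup.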

Code & Models

Repositories

sankin97/logonet (Official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Robotics and Sensor-Based Localization