DeepInteraction: 3D Object Detection via Modality Interaction

Zeyu Yang; Jiaqi Chen; Zhenwei Miao; Wei Li; Xiatian Zhu; Li Zhang

arXiv:2208.11112 · cs.CV · December 9, 2022 · 65 cites

TL;DR

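DeepInteraction replaces conventional multi-modal fusion with a modality interaction strategy for 3D object detection: per-modality representations are learned and maintained throughout the detector so that modality-specific information stays available. The paper reports that the method surpasses prior methods on the nuScenes dataset.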

Contribution

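The paper proposes a modality interaction strategy and a dedicated DeepInteraction architecture, built from a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder, to better exploit modality-specific information in 3D detection.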

Findings

01 Outperforms prior methods on the nuScenes dataset, often by a large margin

02 Ranks first on the nuScenes leaderboard

03 Demonstrates accuracy gains from maintaining separate per-modality representations rather than a single fused one

Abstract

Existing top-performance 3D object detectors typically rely on the multi-modal fusion strategy. This design is however fundamentally restricted due to overlooking the modality-specific useful information and finally hampering the model performance. To address this limitation, in this work we introduce a novel modality interaction strategy where individual per-modality representations are learned and maintained throughout for enabling their unique characteristics to be exploited during object detection. To realize this proposed strategy, we design a DeepInteraction architecture characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder. Experiments on the large-scale nuScenes dataset show that our proposed method surpasses all prior arts often by a large margin. Crucially, our method is ranked at the first position at the highly…
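The interaction idea described in the abstract can be illustrated with a toy sketch. This is not the paper's architecture: the function names (`cross_attend`, `interact`), the use of plain dot-product attention with a residual update, the two-dimensional features, and the labeling of the two modalities as LiDAR and camera are all assumptions made for illustration. It only shows the structural point that each modality is refined with context from the other while both representations are kept separate instead of being merged into one.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """For each query vector, return an attention-weighted mean of keys_values."""
    dim = len(queries[0])
    out = []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[i] for w, kv in zip(weights, keys_values))
                    for i in range(dim)])
    return out


def interact(lidar, camera):
    """One hypothetical interaction step.

    Each modality gathers context from the other and adds it to its own
    features. Two feature sets go in and two come out; nothing is fused.
    """
    lidar_ctx = cross_attend(lidar, camera)
    camera_ctx = cross_attend(camera, lidar)
    new_lidar = [[a + b for a, b in zip(f, c)]
                 for f, c in zip(lidar, lidar_ctx)]
    new_camera = [[a + b for a, b in zip(f, c)]
                  for f, c in zip(camera, camera_ctx)]
    return new_lidar, new_camera


# Toy inputs (assumed shapes): 3 LiDAR tokens and 2 camera tokens, 2-D each.
lidar_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
camera_feats = [[0.5, 0.5], [1.0, -1.0]]

# Stack two interaction steps; both representations persist throughout.
for _ in range(2):
    lidar_feats, camera_feats = interact(lidar_feats, camera_feats)
```

After the loop, `lidar_feats` still holds three vectors and `camera_feats` still holds two, which is the contrast with a fusion design that would collapse them into a single shared representation before detection.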

Code & Models

Videos

DeepInteraction: 3D Object Detection via Modality Interaction · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Multimodal Machine Learning Applications