3M3D: Multi-view, Multi-path, Multi-representation for 3D Object Detection

Jongwoo Park; Apoorv Singh; Varun Bankiti

arXiv:2302.08231 · cs.CV · July 31, 2023

TL;DR

The paper introduces 3M3D, a multi-camera 3D object detection method that updates both multi-view features and query features with attention, and reports improved detection performance on the nuScenes autonomous driving benchmark.

Contribution

It proposes a multi-view, multi-path, multi-representation framework that updates multi-view features with multi-view axis self-attention and refines multi-representation queries to improve scene understanding.

Findings

01 Improves 3D detection accuracy on the nuScenes dataset.

02 Enhances global and local scene understanding through multi-view feature updates.

03 Achieves performance gains over baseline models.

Abstract

3D visual perception tasks based on multi-camera images are essential for autonomous driving systems. Recent work in this field performs 3D object detection by taking multi-view images as input and iteratively enhancing object queries (object proposals) by cross-attending to multi-view features. However, the individual backbone features are not updated with multi-view features; they remain a mere collection of the outputs of the single-image backbone network. Therefore, we propose 3M3D: A Multi-view, Multi-path, Multi-representation for 3D Object Detection, in which we update both multi-view features and query features to enhance the representation of the scene in both a fine panoramic view and a coarse global view. First, we update multi-view features with multi-view axis self-attention, which incorporates panoramic information into the multi-view features and enhances understanding of the…
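
The idea of self-attention along the multi-view axis can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the excerpt above does not give the exact formulation, so the function name `view_axis_self_attention`, the single-head design, the projection matrices `w_q`, `w_k`, `w_v`, the residual connection, and the `(N, H, W, C)` tensor layout are all assumptions made for illustration. The sketch treats the N camera views at each spatial location as a short token sequence and lets each view attend to the others.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def view_axis_self_attention(feats, w_q, w_k, w_v):
    """Hypothetical single-head self-attention across the camera-view axis.

    feats: (N, H, W, C) backbone features for N camera views (assumed layout).
    w_q, w_k, w_v: (C, C) projection matrices (assumed, illustrative only).
    Returns features of the same shape, where each view's feature at a given
    spatial location has been mixed with the other views' features there.
    """
    n, h, w, c = feats.shape
    # Group by spatial location: each location holds a sequence of N view tokens.
    tokens = feats.transpose(1, 2, 0, 3).reshape(h * w, n, c)  # (HW, N, C)
    q = tokens @ w_q
    k = tokens @ w_k
    v = tokens @ w_v
    # Scaled dot-product attention over the view axis.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])   # (HW, N, N)
    attn = softmax(scores, axis=-1)
    # Residual connection (an assumption) keeps the original backbone signal.
    out = tokens + attn @ v                                     # (HW, N, C)
    return out.reshape(h, w, n, c).transpose(2, 0, 1, 3)        # (N, H, W, C)

# Example usage with random features for 6 cameras.
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4, 5, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) * 0.1 for _ in range(3))
updated = view_axis_self_attention(feats, w_q, w_k, w_v)
print(updated.shape)
```

In this sketch the attention matrix at each location has shape N x N, so the cost grows with the square of the number of cameras, which is small for typical multi-camera rigs.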

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization