VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention

Shengheng Deng; Zhihao Liang; Lin Sun; Kui Jia

arXiv:2203.09704·cs.CV·March 21, 2022

TL;DR

VISTA introduces a novel multi-view fusion module with dual cross-view spatial attention for improved 3D object detection from LiDAR data, significantly enhancing accuracy on autonomous driving benchmarks.

Contribution

The paper proposes a new plug-and-play fusion module with a convolutional attention mechanism and decoupled tasks, advancing multi-view 3D detection performance.

Findings

1. Achieves 63.0% mAP and 69.8% NDS on nuScenes
2. Outperforms existing methods by up to 24% in key categories
3. Demonstrates effectiveness through extensive experiments

Abstract

Detecting objects from LiDAR point clouds is of tremendous significance in autonomous driving. In spite of good progress, accurate and reliable 3D detection is yet to be achieved due to the sparsity and irregularity of LiDAR point clouds. Among existing strategies, multi-view methods have shown great promise by leveraging the more comprehensive information from both bird's eye view (BEV) and range view (RV). These multi-view methods either refine the proposals predicted from single view via fused features, or fuse the features without considering the global spatial context; their performance is limited consequently. In this paper, we propose to adaptively fuse multi-view features in a global spatial context via Dual Cross-VIew SpaTial Attention (VISTA). The proposed VISTA is a novel plug-and-play fusion module, wherein the multi-layer perceptron widely adopted in standard attention…
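
To make the idea of fusing views in a global spatial context more concrete, here is a minimal sketch of generic cross-view attention in plain Python. It is not the VISTA module: the summary above describes a convolutional attention mechanism, whereas this toy version applies no learned projections at all and only shows one direction of attention. The function name `cross_view_attention`, the choice of BEV cells as queries and RV cells as keys and values, and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_view_attention(queries, keys, values):
    """Hypothetical helper: scaled dot-product attention across two views.

    Every query cell attends to every cell of the other view, so each
    fused feature is a weighted sum over the whole source view.
    """
    scale = math.sqrt(len(keys[0]))
    channels = len(values[0])
    fused, weights = [], []
    for q in queries:
        w = softmax([dot(q, k) / scale for k in keys])
        weights.append(w)
        fused.append(
            [sum(wi * v[c] for wi, v in zip(w, values)) for c in range(channels)]
        )
    return fused, weights


# Toy inputs (assumed values): 2 BEV cells and 3 RV cells, 2 channels each.
bev = [[1.0, 0.0], [0.0, 1.0]]
rv = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# BEV cells act as queries; RV cells supply keys and values.
fused, weights = cross_view_attention(bev, rv, rv)
```

Each row of `weights` sums to 1 and spans all RV cells, which is the sense in which the fusion is global rather than restricted to a local neighbourhood. Swapping the roles of `bev` and `rv` gives attention in the opposite direction.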

Code & Models

Repositories

gorilla-lab-scut/vista (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods