D-Align: Dual Query Co-attention Network for 3D Object Detection Based on Multi-frame Point Cloud Sequence

Junhyung Lee; Junho Koh; Youngwoo Lee; Jun Won Choi

arXiv:2210.00087 · cs.CV · October 4, 2022 · 1 cites

TL;DR

D-Align introduces a dual-query co-attention network that leverages multi-frame point cloud sequences to enhance 3D object detection accuracy by aligning and aggregating spatio-temporal features.

Contribution

It proposes a novel dual-query co-attention network for effectively integrating temporal information in 3D object detection from point cloud sequences.

Findings

01. Significantly outperforms existing 3D detectors on the nuScenes dataset.

02. Improves detection accuracy over single-frame baseline methods.

03. Effectively aligns and aggregates features from multiple frames.

Abstract

LiDAR sensors are widely used for 3D object detection in various mobile robotics applications. LiDAR sensors continuously generate point cloud data in real time. Conventional 3D object detectors detect objects using a set of points acquired over a fixed duration. However, recent studies have shown that the performance of object detection can be further enhanced by utilizing spatio-temporal information obtained from point cloud sequences. In this paper, we propose a new 3D object detector, named D-Align, which can effectively produce strong bird's-eye-view (BEV) features by aligning and aggregating the features obtained from a sequence of point sets. The proposed method includes a novel dual-query co-attention network that uses two types of queries, including target query set (T-QS) and support query set (S-QS), to update the features of target and support frames, respectively. D-Align…
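
As a rough illustration of the dual-query idea described in the abstract, the sketch below keeps two query sets, one for the target frame and one for a support frame, and updates each by attending to the other. It is a toy under stated assumptions, not the authors' implementation: the abstract does not specify how T-QS and S-QS interact, so plain scaled dot-product cross-attention with a residual update is assumed here, with no learned projections. The function names (`cross_attend`, `co_attention_step`) and the flattened-BEV-cell representation are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Scaled dot-product attention of each query over keys_values.

    Assumption: keys double as values and there are no learned
    projections, which a real detector would certainly have.
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        attended.append([
            sum(w * kv[d] for w, kv in zip(weights, keys_values))
            for d in range(dim)
        ])
    return attended


def co_attention_step(target_qs, support_qs):
    """One symmetric update of both query sets (hypothetical scheme).

    target_qs / support_qs: lists of feature vectors, one per BEV cell.
    Each set gathers context from the other and adds it residually.
    """
    target_ctx = cross_attend(target_qs, support_qs)
    support_ctx = cross_attend(support_qs, target_qs)
    new_target = [
        [f + c for f, c in zip(feat, ctx)]
        for feat, ctx in zip(target_qs, target_ctx)
    ]
    new_support = [
        [f + c for f, c in zip(feat, ctx)]
        for feat, ctx in zip(support_qs, support_ctx)
    ]
    return new_target, new_support


if __name__ == "__main__":
    # Three target cells and two support cells with 2-D toy features.
    target = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    support = [[0.2, 0.8], [0.9, 0.1]]
    new_target, new_support = co_attention_step(target, support)
    print(len(new_target), len(new_support))
```

Each query set keeps its own size after the update, so the step can be repeated or applied once per support frame in a sequence. A practical version would operate on tensors and add learned projections; those details are left out because the truncated abstract does not describe them.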

Code & Models

Repositories

junhyung-SPALab/D-Align
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques