RCTrans: Radar-Camera Transformer via Radar Densifier and Sequential Decoder for 3D Object Detection

Yiheng Li; Yang Yang; Zhen Lei

arXiv:2412.12799 · cs.CV · December 18, 2024

TL;DR

RCTrans is a query-based radar-camera transformer for 3D object detection. It densifies the sparse radar tokens before fusing them with image tokens, then localizes objects step by step with a sequential decoder, reporting state-of-the-art results on nuScenes.

Contribution

The paper presents a new query-based detection framework with a radar densifier and sequential decoder, improving radar-camera fusion and 3D detection accuracy.

Findings

01

Achieves new state-of-the-art results on the nuScenes dataset.

02

Reduces interference from empty radar tokens during fusion by enriching the sparse valid radar tokens first.

03

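Improves 3D object localization by locating objects gradually through sequential fusion, which alleviates the elevation ambiguity of radar point clouds.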

Abstract

In radar-camera 3D object detection, the radar point clouds are sparse and noisy, which causes difficulties in fusing camera and radar modalities. To solve this, we introduce a novel query-based detection method named Radar-Camera Transformer (RCTrans). Specifically, we first design a Radar Dense Encoder to enrich the sparse valid radar tokens, and then concatenate them with the image tokens. By doing this, we can fully explore the 3D information of each region of interest and reduce the interference of empty tokens during the fusing stage. We then design a Pruning Sequential Decoder to predict 3D boxes based on the obtained tokens and randomly initialized queries. To alleviate the effect of elevation ambiguity in radar point clouds, we gradually locate the position of the object via a sequential fusion structure. It helps to get more precise and flexible correspondences between tokens and…
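The pipeline described in the abstract (enrich the valid radar tokens, concatenate them with image tokens, then decode with sequential fusion) can be illustrated with a rough NumPy sketch. This is not the authors' implementation; the official PyTorch code is in the repository listed under Code & Models. Everything here is an assumption made for illustration: the function names (`densify_valid_tokens`, `sequential_decoder`), the use of plain single-head attention as a stand-in for the Radar Dense Encoder, the radar-then-image attention order, and the rule of pruning queries by a linear score with a fixed `keep_ratio`.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, tokens):
    # Single-head scaled dot-product attention with a residual connection.
    scores = queries @ tokens.T / np.sqrt(queries.shape[-1])
    return queries + softmax(scores) @ tokens

def densify_valid_tokens(radar_grid, valid_mask):
    # Hypothetical stand-in for the Radar Dense Encoder: drop empty cells,
    # then let the remaining valid tokens exchange information.
    valid = radar_grid[valid_mask]
    return cross_attend(valid, valid)

def sequential_decoder(queries, fused_tokens, n_radar, score_w,
                       num_layers=3, keep_ratio=0.5):
    # Assumed layout: the first n_radar rows are radar tokens, the rest image tokens.
    radar_tokens, image_tokens = fused_tokens[:n_radar], fused_tokens[n_radar:]
    for layer in range(num_layers):
        # Sequential fusion (assumed order): radar first, then image.
        queries = cross_attend(queries, radar_tokens)
        queries = cross_attend(queries, image_tokens)
        if layer < num_layers - 1:
            # Assumed pruning rule: keep the highest-scoring queries.
            scores = queries @ score_w
            keep = max(1, int(len(queries) * keep_ratio))
            queries = queries[np.argsort(scores)[::-1][:keep]]
    return queries

rng = np.random.default_rng(0)
dim = 16
radar_grid = rng.normal(size=(64, dim))   # toy radar feature cells
valid_mask = rng.random(64) < 0.1         # most cells are empty
valid_mask[0] = True                      # guarantee at least one valid token
radar_tokens = densify_valid_tokens(radar_grid, valid_mask)
image_tokens = rng.normal(size=(32, dim))
fused_tokens = np.concatenate([radar_tokens, image_tokens], axis=0)

queries = rng.normal(size=(20, dim))      # randomly initialized queries
score_w = rng.normal(size=dim)            # hypothetical scoring weights
out = sequential_decoder(queries, fused_tokens, len(radar_tokens), score_w)
print(out.shape)
```

With 20 queries, three layers, and `keep_ratio=0.5`, pruning after the first two layers leaves 5 queries. A real decoder would regress a 3D box and class scores from each surviving query rather than return raw features.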

Code & Models

Repositories

liyih/rctrans
PyTorch · Official

Videos

RCTrans: Radar-Camera Transformer via Radar Densifier and Sequential Decoder for 3D Object Detection

Taxonomy

Topics: Advanced SAR Imaging Techniques · Infrared Target Detection Methodologies · Optical Systems and Laser Technology

Methods: Linear Layer · Dropout · Attention Is All You Need · Dense Connections · Byte Pair Encoding · Multi-Head Attention · Adam · Layer Normalization · Position-Wise Feed-Forward Layer · Pruning