Leveraging Transformer Decoder for Automotive Radar Object Detection
Changxu Zhang, Zhaoze Wang, Tai Fei, Christopher Grimm, Yi Jin, Claas Tebruegge, Ernst Warsitz, Markus Gardill

TL;DR
This paper introduces a Transformer-based architecture with a novel decoder and Pyramid Token Fusion for direct 3D radar object detection, achieving state-of-the-art results without heuristic post-processing.
Contribution
It presents a new Transformer decoder and Pyramid Token Fusion module for end-to-end radar object detection, reducing reliance on traditional proposal and NMS methods.
Findings
Significant performance improvements over radar-only baseline methods on RADDet
Effective modeling of long-range spatial-temporal correlations
Elimination of dense proposal generation and heuristic NMS
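The last finding follows from treating detection as set prediction. When each ground-truth box is assigned to exactly one query during training, duplicate predictions are penalized by the loss instead of being pruned afterwards by NMS. The sketch below illustrates that one-to-one assignment on toy data. It is an assumption-laden illustration, not the paper's training code: the L1 cost, the brute-force search, and the six-number box layout are all hypothetical choices made for readability.

```python
from itertools import permutations


def match_cost(pred, gt):
    """L1 distance between two boxes given as equal-length tuples (assumed cost)."""
    return sum(abs(p - g) for p, g in zip(pred, gt))


def one_to_one_match(preds, gts):
    """Brute-force minimum-cost assignment of ground-truth boxes to distinct predictions.

    Returns a list of (pred_index, gt_index) pairs. Only suitable for tiny
    inputs; practical systems typically use the Hungarian algorithm instead.
    """
    if len(gts) > len(preds):
        raise ValueError("need at least as many predictions as ground-truth boxes")
    best_cost, best_perm = float("inf"), ()
    # perm[g] is the prediction index assigned to ground-truth box g.
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(match_cost(preds[p], gts[g]) for g, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return [(p, g) for g, p in enumerate(best_perm)]


# Hypothetical boxes: three center coordinates followed by three sizes.
preds = [
    (0.0, 0.0, 0.0, 1.0, 1.0, 1.0),
    (5.0, 5.0, 5.0, 2.0, 2.0, 2.0),
    (5.1, 5.0, 5.0, 2.0, 2.0, 2.0),  # near-duplicate of prediction 1
]
gts = [
    (5.0, 5.0, 5.0, 2.0, 2.0, 2.0),
    (0.1, 0.0, 0.0, 1.0, 1.0, 1.0),
]

matches = one_to_one_match(preds, gts)
# Predictions left without a ground-truth partner are trained as "no object".
unmatched = sorted(set(range(len(preds))) - {p for p, _ in matches})
print(matches, unmatched)
```

Here the near-duplicate prediction ends up unmatched, so a set-based loss would push it toward the "no object" class rather than relying on a suppression threshold at inference time.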
Abstract
In this paper, we present a Transformer-based architecture for 3D radar object detection that uses a novel Transformer Decoder as the prediction head to directly regress 3D bounding boxes and class scores from radar feature representations. To bridge multi-scale radar features and the decoder, we propose Pyramid Token Fusion (PTF), a lightweight module that converts a feature pyramid into a unified, scale-aware token sequence. By formulating detection as a set prediction problem with learnable object queries and positional encodings, our design models long-range spatial-temporal correlations and cross-feature interactions. This approach eliminates dense proposal generation and heuristic post-processing such as extensive non-maximum suppression (NMS) tuning. We evaluate the proposed framework on the RADDet dataset, where it achieves significant improvements over state-of-the-art radar-only…
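The abstract does not spell out how PTF is built, so the following is only a minimal sketch of the general idea of turning a feature pyramid into one scale-aware token sequence. It assumes that every pyramid level already has the same channel width and that scale information is injected by adding a per-level embedding to each token. The function name `pyramid_to_tokens`, the additive embedding, and the toy values are hypothetical and are not taken from the paper.

```python
def pyramid_to_tokens(pyramid, level_embeddings):
    """Flatten a feature pyramid into one token sequence.

    pyramid: list of levels; each level is an H x W grid (nested lists) of
        C-dimensional feature vectors.
    level_embeddings: one C-dimensional vector per level, added to every
        token from that level so the sequence keeps scale information.
    Returns (tokens, positions), where positions[i] = (level, row, col).
    """
    tokens, positions = [], []
    for level, (grid, emb) in enumerate(zip(pyramid, level_embeddings)):
        for r, row in enumerate(grid):
            for c, feat in enumerate(row):
                # Element-wise sum of the feature vector and its level embedding.
                tokens.append([f + e for f, e in zip(feat, emb)])
                positions.append((level, r, c))
    return tokens, positions


# Toy pyramid with C = 2 channels: a 2x2 level followed by a 1x1 level.
pyramid = [
    [[[1.0, 0.0], [2.0, 0.0]],
     [[3.0, 0.0], [4.0, 0.0]]],
    [[[10.0, 0.0]]],
]
# Stand-ins for per-level embeddings that would be learned in practice.
level_embeddings = [[0.0, 1.0], [0.0, 2.0]]

tokens, positions = pyramid_to_tokens(pyramid, level_embeddings)
print(len(tokens), tokens[-1], positions[-1])
```

In a decoder-based detector, a sequence like `tokens` would serve as the keys and values that the object queries attend to, with `positions` feeding a positional encoding.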
Taxonomy
Topics: Advanced SAR Imaging Techniques · Radar Systems and Signal Processing · Advanced Neural Network Applications
