RCDPT: Radar-Camera fusion Dense Prediction Transformer

Chen-Chou Lo; Patrick Vandewalle

arXiv:2211.02432 · cs.CV · March 3, 2023

TL;DR

This paper introduces RCDPT, a radar-camera fusion method built on a dense prediction transformer. It improves monocular depth estimation by fusing radar representations with camera representations instead of relying on readout tokens, and it outperforms existing convolutional depth estimation models that integrate radar.

Contribution

The paper proposes a new radar-camera fusion strategy for dense prediction transformers that improves depth estimation by reassembling camera representations with radar representations.

Findings

01 Outperforms existing convolutional depth estimation models that integrate radar.

02 Provides a better fusion strategy than commonly used approaches on the nuScenes dataset.

03 Improves monocular depth estimation accuracy by using radar data.

Abstract

Recently, transformer networks have outperformed traditional deep neural networks in natural language processing and show great potential in many computer vision tasks compared to convolutional backbones. In the original transformer, readout tokens are used as designated vectors for aggregating information from other tokens. However, the performance of using readout tokens in a vision transformer is limited. Therefore, we propose a novel fusion strategy to integrate radar data into a dense prediction transformer network by reassembling camera representations with radar representations. Instead of using readout tokens, radar representations contribute additional depth information to a monocular depth estimation model and improve performance. We further investigate different fusion approaches that are commonly used for integrating an additional modality in a dense prediction transformer…
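The idea of reassembling camera representations with radar representations can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `reassemble_with_radar`, the concatenate-then-project fusion, the tensor shapes, and the random weights are all assumptions chosen for illustration. The sketch drops the readout token, pairs each camera patch token with a radar token for the same patch, projects the pair back to the original width, and reshapes the result into an image-like feature map.

```python
import numpy as np


def reassemble_with_radar(camera_tokens, radar_tokens, weight, bias, grid_hw):
    """Fuse per-patch camera and radar tokens and reshape into a feature map.

    Hypothetical helper, not taken from the RCDPT code base.

    camera_tokens: (1 + N, D) array; row 0 is the readout token.
    radar_tokens:  (N, D) array; one token per image patch.
    weight, bias:  linear projection from 2 * D back to D features.
    grid_hw:       (rows, cols) of the patch grid, with rows * cols == N.
    """
    rows, cols = grid_hw
    patch_tokens = camera_tokens[1:]  # discard the readout token entirely
    if patch_tokens.shape != radar_tokens.shape:
        raise ValueError("camera patch tokens and radar tokens must match in shape")
    if rows * cols != patch_tokens.shape[0]:
        raise ValueError("grid size must equal the number of patch tokens")

    # Concatenate along the feature axis: (N, D) and (N, D) -> (N, 2 * D).
    fused = np.concatenate([patch_tokens, radar_tokens], axis=-1)
    # Project back to D features so later stages see the usual width.
    fused = fused @ weight + bias
    # Lay the tokens back out on the patch grid: (rows, cols, D).
    return fused.reshape(rows, cols, -1)


# Toy example with random values standing in for learned features.
rng = np.random.default_rng(0)
dim, grid = 8, (3, 4)
num_patches = grid[0] * grid[1]

camera = rng.standard_normal((num_patches + 1, dim))
radar = rng.standard_normal((num_patches, dim))
weight = rng.standard_normal((2 * dim, dim)) * 0.1
bias = np.zeros(dim)

feature_map = reassemble_with_radar(camera, radar, weight, bias, grid)
print(feature_map.shape)
```

In this sketch the readout token has no influence on the output, so the extra information reaching the decoder comes from the radar tokens rather than from a designated aggregation vector.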

Code & Models

Repositories

lochenchou/rcdpt (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Image Processing Techniques and Applications · Advanced Optical Sensing Technologies

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Dense Connections · Layer Normalization · Residual Connection · Vision Transformer