Occlusion-Aware Multimodal Beam Prediction and Pose Estimation for mmWave V2I

Abidemi Orimogunje; Hyunwoo Park; Kyeong-Ju Cha; Igbafe Orikumhi; Sunwoo Kim; Dejan Vukobratovic

arXiv:2603.25799·eess.SP·March 30, 2026

TL;DR

This paper introduces an occlusion-aware multimodal learning framework that fuses camera, LiDAR, radar, GNSS, and mmWave power-history inputs in a Transformer network to improve beam prediction and pose estimation in mmWave V2I systems under dynamic blockage.

Contribution

The paper presents a SLAM-inspired multimodal fusion approach that jointly predicts the receive beam index, blockage probability, and 2D position, and that outperforms radio-only and camera-only baselines.

Findings

01

Achieves 50.92% Top-1 beam accuracy on the 60 GHz DeepSense 6G Scenario 31 dataset.

02

Multimodal fusion outperforms radio-only and camera-only baselines.

03

Estimates 2D position with 1.33 m RMSE.

Abstract

We propose an occlusion-aware multimodal learning framework that is inspired by simultaneous localization and mapping (SLAM) concepts for trajectory interpretation and pose prediction. Targeting mmWave vehicle-to-infrastructure (V2I) beam management under dynamic blockage, our Transformer-based fusion network ingests synchronized RGB images, LiDAR point clouds, radar range-angle maps, GNSS, and short-term mmWave power history. It jointly predicts the receive beam index, blockage probability, and 2D position using labels automatically derived from 64-beam sweep power vectors, while an offline LiDAR map enables SLAM-style trajectory visualization. On the 60 GHz DeepSense 6G Scenario 31 dataset, the model achieves 50.92% Top-1 and 86.50% Top-3 beam accuracy with 0.018 bits/s/Hz spectral-efficiency loss, 63.35% blocked-class F1, and 1.33 m position RMSE. Multimodal fusion outperforms…
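
The beam metrics quoted in the abstract can be illustrated with a small sketch. It is not the authors' code: it assumes the beam label is the index of the strongest entry in a sweep power vector, that Top-k accuracy counts a hit when that label is among the k highest-scoring predicted beams, and that position RMSE is the square root of the mean squared Euclidean error. The function names (`best_beam`, `top_k_accuracy`, `position_rmse`) and the 4-beam toy data are hypothetical, chosen for illustration only.

```python
import math
from typing import Sequence, Tuple


def best_beam(power: Sequence[float]) -> int:
    """Assumed labeling rule: index of the strongest beam in one sweep."""
    return max(range(len(power)), key=lambda i: power[i])


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose label is among the k highest-scoring beams."""
    hits = 0
    for row, label in zip(scores, labels):
        # Rank beam indices from highest to lowest predicted score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


def position_rmse(pred: Sequence[Tuple[float, float]],
                  true: Sequence[Tuple[float, float]]) -> float:
    """Root of the mean squared Euclidean distance between 2D positions."""
    sq = [(px - tx) ** 2 + (py - ty) ** 2
          for (px, py), (tx, ty) in zip(pred, true)]
    return math.sqrt(sum(sq) / len(sq))


# Toy data with 4 beams instead of 64, purely for illustration.
sweeps = [[0.1, 0.9, 0.3, 0.2], [0.5, 0.2, 0.7, 0.1]]
labels = [best_beam(p) for p in sweeps]            # [1, 2]
scores = [[0.2, 0.5, 0.6, 0.1], [0.1, 0.3, 0.8, 0.4]]

top1 = top_k_accuracy(scores, labels, k=1)         # 0.5
top2 = top_k_accuracy(scores, labels, k=2)         # 1.0
rmse = position_rmse([(0.0, 0.0), (3.0, 4.0)],
                     [(0.0, 0.0), (0.0, 0.0)])     # sqrt(12.5)
```

In the toy data the first sample's true beam is only the second-ranked prediction, so it counts toward Top-2 but not Top-1. The same effect explains why the reported Top-3 accuracy is higher than the Top-1 accuracy.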
