Multimodal Learning for MIMO Beam Prediction Based on Variational Inference

Zijian Zheng; Wenqiang Yi; Hyundong Shin; Arumugam Nallanathan

arXiv:2605.14650·eess.SP·May 15, 2026

TL;DR

This paper introduces a variational inference framework for multimodal learning to improve MIMO beam prediction accuracy, especially when limited multimodal data is available.
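For orientation, variational inference usually trains a latent-variable model by maximizing an evidence lower bound (ELBO). The generic textbook form is given below. The symbols are assumptions for illustration ($x$ for observed data, $z$ for a latent representation, $q_\phi$ for the approximate posterior, $p_\theta$ for the generative model), and this is not necessarily the objective used in the paper.

```latex
\log p_\theta(x)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  \;-\;
  \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
```

The first term rewards latents that explain the data, while the KL term keeps the approximate posterior close to the prior $p(z)$.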

Contribution

The proposed two-stage training strategy enhances data efficiency and robustness in multimodal beam prediction by decoupling feature extraction and semantic alignment.

Findings

01. Achieves competitive beam prediction accuracy with only 20% of multimodal training data.

02. Enhances data efficiency and robustness under sensing uncertainties.

03. Demonstrates effectiveness on the DeepSense6G dataset.

Abstract

Accurate beam prediction is essential for mitigating signalling overhead and latency in integrated sensing and communication-enabled massive multi-input multi-output systems. With the aid of multimodal learning, the prediction accuracy can be enhanced by leveraging the complementary information from other existing sensors, but the practical deployment is often constrained by the high cost of acquiring semantically aligned multimodal datasets. This paper proposes a variational-inference-based multimodal framework that decouples the optimization problem into modular feature extraction and cross-modal semantic alignment. Specifically, we develop a two-stage training strategy where the model utilises abundant unimodal data for representation learning before performing refined alignment on limited multimodal samples. This design enhances data efficiency and ensures robust feature fusion…
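The two-stage schedule described above can be illustrated with a toy sketch. It is not the paper's model: the encoders here are hypothetical 1-D least-squares fits rather than neural networks, no variational objective is used, and the data, gains, noise levels, and function names are all invented for illustration. It only shows the ordering of the steps: fit each modality on its own abundant unimodal data, freeze the result, then fit a fusion step on a small paired subset (20% here, echoing the figure quoted in the findings).

```python
import random

def fit_scale(xs, ys):
    """Least-squares slope w for y ~ w * x (a stand-in 1-D 'encoder')."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_fusion(fa, fb, ys):
    """Least-squares weights (a, b) for y ~ a*fa + b*fb via 2x2 normal equations."""
    saa = sum(u * u for u in fa)
    sbb = sum(v * v for v in fb)
    sab = sum(u * v for u, v in zip(fa, fb))
    say = sum(u * y for u, y in zip(fa, ys))
    sby = sum(v * y for v, y in zip(fb, ys))
    det = saa * sbb - sab * sab
    return (say * sbb - sab * sby) / det, (saa * sby - sab * say) / det

def mse(pred, ys):
    """Mean squared error between predictions and targets."""
    return sum((p - y) ** 2 for p, y in zip(pred, ys)) / len(ys)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical toy modalities: observation = gain * target + Gaussian noise.
GAIN_A, NOISE_A = 2.0, 0.5
GAIN_B, NOISE_B = -3.0, 0.9

def sample_unimodal(n, gain, noise):
    ys = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    xs = [gain * y + rng.gauss(0.0, noise) for y in ys]
    return xs, ys

# Stage 1: fit each encoder separately on abundant unimodal data.
xa_uni, ya_uni = sample_unimodal(2000, GAIN_A, NOISE_A)
xb_uni, yb_uni = sample_unimodal(2000, GAIN_B, NOISE_B)
w_a = fit_scale(xa_uni, ya_uni)
w_b = fit_scale(xb_uni, yb_uni)

# A paired (aligned) multimodal pool; only 20% of it is used in stage 2.
n_full = 500
y_pair = [rng.uniform(-1.0, 1.0) for _ in range(n_full)]
xa_pair = [GAIN_A * y + rng.gauss(0.0, NOISE_A) for y in y_pair]
xb_pair = [GAIN_B * y + rng.gauss(0.0, NOISE_B) for y in y_pair]
n_used = n_full // 5

# Stage 2: encoders stay frozen (w_a, w_b are not refit); only fusion is fit.
fa = [w_a * x for x in xa_pair[:n_used]]
fb = [w_b * x for x in xb_pair[:n_used]]
ys = y_pair[:n_used]
a, b = fit_fusion(fa, fb, ys)

fused = [a * u + b * v for u, v in zip(fa, fb)]
mse_a, mse_b, mse_fused = mse(fa, ys), mse(fb, ys), mse(fused, ys)
print(f"paired samples used: {n_used} of {n_full}")
print(f"MSE modality A: {mse_a:.4f}, modality B: {mse_b:.4f}, fused: {mse_fused:.4f}")
```

Because the fusion weights are a least-squares fit over both frozen encoder outputs, the fused error on the paired subset cannot exceed either unimodal error on that same subset. Generalization to held-out data is not demonstrated by this sketch.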
