Multimodal Wireless Foundation Models
Ahmed Aboulfotouh, Hatem Abou-Zeid

TL;DR
This paper introduces the first multimodal wireless foundation model that processes both raw IQ streams and image-like wireless data, enabling diverse wireless tasks and outperforming single-modality models in several scenarios.
Contribution
The work presents a novel multimodal wireless foundation model, trained with a self-supervised method, that handles multiple data types and tasks in wireless communication and sensing.
Findings
Competitive with single-modality models
Surpasses single-modality models in several tasks
Supports diverse wireless applications
Abstract
Wireless foundation models (WFMs) have recently demonstrated promising capabilities, jointly performing multiple wireless functions and adapting effectively to new environments. However, current WFMs process only one modality, even though the most informative modality changes with the task and operating conditions, and no single modality is best for all tasks. WFMs should therefore be designed to accept multiple modalities to enable a broader and more diverse range of tasks and scenarios. In this work, we propose and build the first multimodal wireless foundation model capable of processing both raw IQ streams and image-like wireless modalities (e.g., spectrograms and CSI) and performing multiple tasks across both. We introduce masked wireless modeling for the multimodal setting, a self-supervised objective and pretraining recipe that learns a joint representation from IQ streams and…
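The abstract is truncated here, so the details of the pretraining recipe are not available on this page. As a rough illustration of the general masked-modeling idea only, the sketch below splits an IQ stream and a spectrogram-like array into patches, hides a random subset of them, and scores a reconstruction on the hidden patches alone. All function names (`patchify_iq`, `patchify_image`, `random_mask`, `masked_mse`), the patch sizes, and the 75% mask ratio are assumptions made for this example and are not taken from the paper; the "prediction" is a placeholder array of zeros standing in for a model output.

```python
import numpy as np


def patchify_iq(iq, patch_len):
    """Split a complex IQ stream into patches of interleaved [I, Q] samples."""
    n = (len(iq) // patch_len) * patch_len           # drop the ragged tail
    stacked = np.stack([iq[:n].real, iq[:n].imag], axis=-1)  # (n, 2)
    return stacked.reshape(-1, patch_len * 2)        # (num_patches, 2 * patch_len)


def patchify_image(img, patch):
    """Split a 2-D image-like array into non-overlapping square patches."""
    h, w = img.shape
    h2, w2 = (h // patch) * patch, (w // patch) * patch
    blocks = img[:h2, :w2].reshape(h2 // patch, patch, w2 // patch, patch)
    return blocks.swapaxes(1, 2).reshape(-1, patch * patch)


def random_mask(num_patches, mask_ratio, rng):
    """Boolean mask with True marking the patches hidden from the model."""
    num_masked = int(round(mask_ratio * num_patches))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask


def masked_mse(pred, target, mask):
    """Mean squared reconstruction error computed on masked patches only."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))


rng = np.random.default_rng(0)

# Synthetic stand-ins for the two modalities (assumed shapes, not real data).
iq = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)
spectrogram = rng.standard_normal((64, 64))

iq_patches = patchify_iq(iq, patch_len=16)            # shape (64, 32)
img_patches = patchify_image(spectrogram, patch=16)   # shape (16, 256)

iq_mask = random_mask(len(iq_patches), 0.75, rng)
img_mask = random_mask(len(img_patches), 0.75, rng)

# Placeholder "model output": all zeros in place of a real reconstruction.
iq_loss = masked_mse(np.zeros_like(iq_patches), iq_patches, iq_mask)
img_loss = masked_mse(np.zeros_like(img_patches), img_patches, img_mask)
print(iq_patches.shape, img_patches.shape, iq_loss, img_loss)
```

In a real pipeline the zero arrays would be replaced by the output of an encoder-decoder that sees only the unmasked patches; how the paper combines the two modalities in one model is not described in the text available here.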
Taxonomy
Topics: Indoor and Outdoor Localization Technologies · Advanced Wireless Communication Technologies · Wireless Signal Modulation Classification
