Sensing and Understanding the World over Air: A Large Multimodal Model for Mobile Networks
Zhuoran Duan, Yuhao Wei, Guoshun Nan, Zijun Wang, Yan Yan, Lihua Xiong, Yuhan Ran, Ji Zhang, Jian Li, Qimei Cui, Xiaofeng Tao, Tony Q. S. Quek

TL;DR
This paper introduces a large multimodal model tailored for wireless networks that leverages wireless signals as a universal modality, enabling better sensing and understanding of the physical world for smart services.
Contribution
It proposes a novel wireless-native multimodal training paradigm and constructs a GPT-style model trained on real-world data, demonstrating superior performance over existing models.
Findings
Wireless signals can serve as a universal modality for multimodal learning.
The proposed WMLM outperforms existing small-scale and multimodal models.
The approach validates the feasibility of using wireless signals for large-scale multimodal models.
Abstract
Large models (LMs), such as ChatGPT, have made a significant impact across diverse domains and hold great potential to facilitate the evolution of network intelligence. Wireless-native multi-modal large models (WMLMs) can sense and understand the physical world through multi-modal data, serving as a key enabler that integrates communication, sensing, and intelligence, and thus they can boost various smart services to billions of users. However, research on WMLMs remains in its infancy, and the construction of domain-specific multi-modal large models for wireless networks is still underexplored. In this paper, we outline the key characteristics of WMLMs and summarize existing methods, on the basis of which a wireless-native multimodal training paradigm is proposed. Specifically, we constructed a GPT-style WMLM and trained it on a real-world large-scale dataset, leveraging…
Taxonomy
Topics: Indoor and Outdoor Localization Technologies · Speech and Audio Processing · Underwater Vehicles and Communication Systems
