Multimodal LLM Integrated Semantic Communications for 6G Immersive Experiences
Yusong Zhang, Yuxuan Sun, Lei Guo, Wei Chen, Bo Ai, Deniz Gunduz

TL;DR
This paper introduces MLLM-SC, a semantic communication framework that uses multimodal large language models for context-aware, task-oriented data transmission in 6G immersive experiences, with the aim of improving transmission efficiency and content quality.
Contribution
The paper proposes a device-edge collaborative architecture integrating pre-trained foundation models for semantic analysis, adaptive encoding, and content generation in 6G communications.
Findings
Effective semantic guidance improves importance-aware data prioritization.
Adaptive bandwidth allocation enhances content quality.
Case studies validate improved performance in AR/VR applications.
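As a rough illustration of how importance-aware prioritization and adaptive bandwidth allocation can fit together, the following minimal sketch splits a fixed bit budget across content regions in proportion to their importance scores. It is not the authors' method: the function name `allocate_bandwidth`, the `min_share` floor, the region names, and the proportional rule are all assumptions made for this example, and in practice the importance scores would come from the semantic guidance stage.

```python
def allocate_bandwidth(importance, total_budget, min_share=0.0):
    """Split total_budget across regions in proportion to importance.

    importance   -- dict mapping a region name to a non-negative score
                    (hypothetical output of a semantic guidance step)
    total_budget -- total bits (or any bandwidth unit) available
    min_share    -- fraction of the budget guaranteed to every region,
                    an assumed safeguard so no region is starved
    """
    if not importance:
        return {}
    if total_budget < 0:
        raise ValueError("total_budget must be non-negative")
    if any(score < 0 for score in importance.values()):
        raise ValueError("importance scores must be non-negative")

    floor = min_share * total_budget
    if floor * len(importance) > total_budget:
        raise ValueError("min_share is too large for the number of regions")

    # Budget left over after every region receives its guaranteed floor.
    remaining = total_budget - floor * len(importance)
    total_score = sum(importance.values())

    if total_score == 0:
        # No region stands out, so fall back to an equal split.
        equal = remaining / len(importance)
        return {name: floor + equal for name in importance}

    return {
        name: floor + remaining * score / total_score
        for name, score in importance.items()
    }


if __name__ == "__main__":
    # Hypothetical importance scores for three regions of a VR frame.
    scores = {"face": 6, "hands": 3, "background": 1}
    for region, bits in allocate_bandwidth(scores, 1000, min_share=0.05).items():
        print(f"{region}: {bits:.1f}")
```

The floor keeps low-importance regions decodable while most of the budget follows the importance scores; other rules (for example, water-filling or a learned policy) could replace the proportional split without changing the interface.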
Abstract
6G networks promise revolutionary immersive communication experiences including augmented reality (AR), virtual reality (VR), and holographic communications. These applications demand high-dimensional multimodal data transmission and intelligent data processing in real time, which is extremely challenging over resource-limited wireless communication systems. Moreover, a joint understanding of the environment, context, and user intent is essential to deliver task-relevant content effectively. This article presents a novel multimodal large language model (MLLM) integrated semantic communications framework, termed MLLM-SC, which fully leverages the reasoning and generative capabilities of pre-trained foundation models for context-aware and task-oriented wireless communication. The MLLM-SC framework adopts a device-edge collaborative architecture. At the edge, MLLM-empowered semantic guidance…
