Multimodal LLM Integrated Semantic Communications for 6G Immersive Experiences

Yusong Zhang; Yuxuan Sun; Lei Guo; Wei Chen; Bo Ai; Deniz Gunduz

arXiv:2507.04621 · cs.LG · July 8, 2025

TL;DR

This paper introduces MLLM-SC, a semantic communication framework that uses multimodal large language models for context-aware, task-oriented data transmission in 6G immersive experiences, with the aim of improving transmission efficiency and content quality.

Contribution

The paper proposes a device-edge collaborative architecture integrating pre-trained foundation models for semantic analysis, adaptive encoding, and content generation in 6G communications.

Findings

01 Effective semantic guidance improves importance-aware data prioritization.

02 Adaptive bandwidth allocation enhances content quality.

03 Case studies validate improved performance in AR/VR applications.
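
The importance-aware prioritization and adaptive bandwidth allocation named in the findings can be illustrated with a small sketch. This summary does not specify the allocation rule used in MLLM-SC, so the code below substitutes a plain proportional split: each content region gets bandwidth in proportion to an importance score, with an optional equal floor so that low-priority regions are not starved. The function name `allocate_bandwidth`, the `floor_fraction` parameter, and the region names and scores are all hypothetical and chosen only for illustration.

```python
from typing import Dict


def allocate_bandwidth(
    importance: Dict[str, float],
    total_bandwidth: float,
    floor_fraction: float = 0.0,
) -> Dict[str, float]:
    """Split total_bandwidth across regions in proportion to importance.

    Illustrative assumption, not the paper's method: a fraction
    `floor_fraction` of the budget is shared equally by all regions,
    and the rest is divided in proportion to the importance scores.
    """
    if not importance:
        return {}
    if total_bandwidth < 0:
        raise ValueError("total_bandwidth must be non-negative")
    if not 0.0 <= floor_fraction <= 1.0:
        raise ValueError("floor_fraction must lie in [0, 1]")
    if any(score < 0 for score in importance.values()):
        raise ValueError("importance scores must be non-negative")

    n = len(importance)
    total_score = sum(importance.values())

    # With no importance signal at all, fall back to an equal split.
    if total_score == 0:
        return {name: total_bandwidth / n for name in importance}

    floor = total_bandwidth * floor_fraction / n
    remaining = total_bandwidth * (1.0 - floor_fraction)
    return {
        name: floor + remaining * score / total_score
        for name, score in importance.items()
    }


if __name__ == "__main__":
    # Hypothetical importance scores for regions of an AR/VR frame.
    scores = {"hands": 0.6, "face": 0.3, "background": 0.1}
    for region, share in allocate_bandwidth(scores, 10.0, 0.1).items():
        print(f"{region}: {share:.3f}")
```

In a setting like the one the paper describes, the importance scores would presumably come from the semantic guidance stage rather than being fixed by hand, and the budget would change with channel conditions. The sketch only shows how such scores could translate into per-region bandwidth shares.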

Abstract

6G networks promise revolutionary immersive communication experiences, including augmented reality (AR), virtual reality (VR), and holographic communications. These applications demand high-dimensional multimodal data transmission and intelligent data processing in real time, which is extremely challenging over resource-limited wireless communication systems. Moreover, a joint understanding of the environment, context, and user intent is essential to deliver task-relevant content effectively. This article presents a novel multimodal large language model (MLLM) integrated semantic communications framework, termed MLLM-SC, which fully leverages the reasoning and generative capabilities of pre-trained foundation models for context-aware and task-oriented wireless communication. The MLLM-SC framework adopts a device-edge collaborative architecture. At the edge, MLLM-empowered semantic guidance…
