Large AI Model Empowered Multimodal Semantic Communications
Feibo Jiang, Li Dong, Yubo Peng, Kezhi Wang, Kun Yang, Cunhua Pan, Xiaohu You

TL;DR
This paper introduces a novel framework leveraging large AI models to enhance multimodal semantic communications by addressing data heterogeneity, semantic ambiguity, and channel distortion, resulting in improved transmission quality.
Contribution
The paper proposes the LAM-MSC framework, which for the first time combines MLM-based multimodal alignment, a personalized LLM-based knowledge base, and GAN-based channel estimation.
Findings
Superior performance demonstrated through simulations
Effective mitigation of fading channel effects
Enhanced semantic consistency across modalities
Abstract
Multimodal signals, including text, audio, image, and video, can be integrated into Semantic Communication (SC) systems to provide an immersive experience with low latency and high quality at the semantic level. However, multimodal SC faces several challenges, including data heterogeneity, semantic ambiguity, and signal distortion during transmission. Recent advancements in large AI models, particularly in the Multimodal Language Model (MLM) and Large Language Model (LLM), offer potential solutions for addressing these issues. To this end, we propose a Large AI Model-based Multimodal SC (LAM-MSC) framework, where we first present the MLM-based Multimodal Alignment (MMA) that utilizes the MLM to enable the transformation between multimodal and unimodal data while preserving semantic consistency. Then, a personalized LLM-based Knowledge Base (LKB) is proposed, which allows users to…
Taxonomy
Topics: Multimodal Machine Learning Applications · AI in cancer detection · Advanced Image and Video Retrieval Techniques
Methods: Balanced Selection
