Cross-Modal Generative Semantic Communications for Mobile AIGC: Joint Semantic Encoding and Prompt Engineering
Yinqiu Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Shiwen Mao, Ping Zhang, Xuemin Shen

TL;DR
This paper introduces a cross-modal semantic communication framework for mobile AI-generated content that significantly reduces bandwidth usage by transmitting only essential semantic information, while maintaining high output quality.
Contribution
It proposes a joint semantic encoding and prompt engineering approach, along with a novel attention-aware diffusion algorithm, to optimize bandwidth and enhance content recovery in mobile AIGC services.
Findings
Bandwidth consumption reduced by 49.4% on average.
ADD algorithm outperforms baseline methods with 1.74x higher reward.
High perceptual quality maintained despite reduced data transmission.
Abstract
Employing massive Mobile AI-Generated Content (AIGC) Service Providers (MASPs) with powerful models, high-quality AIGC services can become accessible for resource-constrained end users. However, this advancement, referred to as mobile AIGC, also introduces a significant challenge: users should download large AIGC outputs from the MASPs, leading to substantial bandwidth consumption and potential transmission failures. In this paper, we apply cross-modal Generative Semantic Communications (G-SemCom) in mobile AIGC to overcome wireless bandwidth constraints. Specifically, we utilize a series of cross-modal attention maps to indicate the correlation between user prompts and each part of AIGC outputs. In this way, the MASP can analyze the prompt context and filter the most semantically important content efficiently. Only semantic information is transmitted, with which users can recover the…
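The attention-based filtering step described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes the cross-modal attention map has already been reduced to one score per image patch, and the names `select_semantic_patches`, `transmitted_fraction`, and `keep_ratio` are hypothetical, introduced here only for illustration.

```python
import math
from typing import List


def select_semantic_patches(attention: List[List[float]],
                            keep_ratio: float) -> List[List[bool]]:
    """Mark the patches with the highest attention scores (toy illustration).

    attention  : 2D grid, one prompt-to-patch attention score per patch (assumed input).
    keep_ratio : fraction of patches to transmit, in (0, 1] (hypothetical parameter).
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    rows, cols = len(attention), len(attention[0])

    # Rank every patch by its attention score, highest first.
    ranked = sorted(
        ((attention[r][c], r, c) for r in range(rows) for c in range(cols)),
        key=lambda item: item[0],
        reverse=True,
    )

    # Keep at least one patch so the receiver always gets some content.
    keep = max(1, math.ceil(keep_ratio * rows * cols))

    mask = [[False] * cols for _ in range(rows)]
    for _, r, c in ranked[:keep]:
        mask[r][c] = True
    return mask


def transmitted_fraction(mask: List[List[bool]]) -> float:
    """Fraction of patches that would be sent under the given mask."""
    total = sum(len(row) for row in mask)
    kept = sum(1 for row in mask for cell in row if cell)
    return kept / total


if __name__ == "__main__":
    scores = [[0.9, 0.1],
              [0.2, 0.7]]
    mask = select_semantic_patches(scores, keep_ratio=0.5)
    print(mask, transmitted_fraction(mask))
```

In this sketch the sender would transmit only the patches where the mask is true, and the receiver would regenerate the rest. How the paper actually chooses what to send and how users recover the output is not covered by the truncated abstract above.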
Taxonomy
Topics: Robotics and Automated Systems · IoT and Edge/Fog Computing · DNA and Biological Computing
