Generative Semantic Communication via Textual Prompts: Latency Performance Tradeoffs

Mengmeng Ren, Li Qiao, Long Yang, Zhen Gao, Jian Chen, Mahdi Boloursaz Mashhadi, Pei Xiao, Rahim Tafazolli, and Mehdi Bennis

arXiv:2409.09715 · cs.IT · May 5, 2025

Open Access

TL;DR

This paper introduces an edge-device collaborative generative semantic communication framework that uses pre-trained multi-modal/vision language models to generate textual prompts from visual content and transmit them over a wireless channel, aiming for low latency and high semantic quality.

Contribution

It develops a multi-user generative semantic communication framework and formulates a joint optimization of prompt generation offloading and communication and computation resource allocation to minimize latency and maximize semantic quality.

Findings

01. Significant latency reduction compared to benchmarks.

02. Enhanced semantic quality through optimized prompt generation.

03. Effective multi-user collaboration with low-complexity algorithms.

Abstract

This paper develops an edge-device collaborative Generative Semantic Communications (Gen SemCom) framework leveraging pre-trained Multi-modal/Vision Language Models (M/VLMs) for ultra-low-rate semantic communication via textual prompts. The proposed framework optimizes the use of M/VLMs on the wireless edge/device to generate high-fidelity textual prompts through visual captioning/question answering, which are then transmitted over a wireless channel for SemCom. Specifically, we develop a multi-user Gen SemCom framework using pre-trained M/VLMs, and formulate a joint optimization problem of prompt generation offloading, communication and computation resource allocation to minimize the latency and maximize the resulting semantic quality. Due to the nonconvex nature of the problem with highly coupled discrete and continuous variables, we decompose it as a two-level problem and propose a…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems