Language-oriented Semantic Communication for Image Transmission with Fine-Tuned Diffusion Model

Xinfeng Wei, Haonan Tong, Nuocheng Yang, and Changchuan Yin

arXiv:2409.17104 · cs.MM · September 26, 2024

Open Access

TL;DR

This paper introduces a semantic communication framework that transmits a descriptive text of an image instead of the image itself. A transformer-based codec protects the text over noisy wireless channels, and a fine-tuned diffusion model regenerates the image at the receiver, reducing transmitted data volume by up to 99%.

Contribution

It develops a novel text-2-image generative semantic communication system with a transformer-based codec and personalized diffusion model fine-tuning for efficient, robust image transmission.

Findings

1. Reduces transmitted data volume by up to 99%.
2. Achieves high perceptual quality in image reconstruction.
3. Demonstrates robustness to wireless channel noise.

Abstract

Ubiquitous image transmission in emerging applications places a heavy overhead on limited wireless resources. Because text can convey a large amount of information with very little data, transmitting the descriptive text of an image can reduce the amount of transmitted data. In this context, this paper develops a novel semantic communication framework based on a text-2-image generative model (Gen-SC). In particular, a transmitter converts the input image to textual modality data. The text is then transmitted through a noisy channel to the receiver, which uses the received text to generate images. Additionally, to improve the robustness of text transmission over noisy channels, we designed a transformer-based text transmission codec model. Moreover, we obtained a personalized knowledge base by fine-tuning the diffusion model to meet the…
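The data-volume argument in the abstract can be illustrated with a small sketch. It is not the paper's method: the caption, the 512×512 RGB image size, and the binary symmetric channel are all hypothetical stand-ins chosen for illustration, and the sketch uses no transformer-based codec and no diffusion model. It only shows how few bytes a caption occupies compared with raw pixels, and how an unprotected caption degrades when bits are flipped in transit, which is the problem a learned text codec is meant to address.

```python
import random


def text_to_bits(text: str) -> list[int]:
    """Encode text as UTF-8 and unpack each byte into 8 bits, MSB first."""
    bits = []
    for byte in text.encode("utf-8"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits


def bits_to_text(bits: list[int]) -> str:
    """Pack bits back into bytes and decode, replacing invalid sequences."""
    data = bytearray()
    usable = len(bits) - len(bits) % 8  # ignore any trailing partial byte
    for i in range(0, usable, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8", errors="replace")


def binary_symmetric_channel(bits: list[int], flip_prob: float,
                             rng: random.Random) -> list[int]:
    """Flip each bit independently with probability flip_prob (assumed channel model)."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]


def data_reduction(image_bytes: int, text_bytes: int) -> float:
    """Fraction of bytes saved by sending text instead of the image."""
    return 1.0 - text_bytes / image_bytes


# Hypothetical example values, not taken from the paper.
caption = "a golden retriever running on a sandy beach at sunset"
raw_image_bytes = 512 * 512 * 3  # uncompressed 8-bit RGB image
text_bytes = len(caption.encode("utf-8"))
reduction = data_reduction(raw_image_bytes, text_bytes)

rng = random.Random(0)  # fixed seed so the run is repeatable
noisy_bits = binary_symmetric_channel(text_to_bits(caption), 0.01, rng)
received = bits_to_text(noisy_bits)

print(f"caption: {text_bytes} bytes, image: {raw_image_bytes} bytes")
print(f"reduction: {reduction:.4%}")
print("received without any channel coding:", received)
```

In the framework described above, the received text would then condition a text-to-image generator at the receiver. That step is omitted here because it depends on the fine-tuned model, which this sketch does not attempt to reproduce.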

Taxonomy

Topics: Cognitive Computing and Networks