Sequential Semantic Generative Communication for Progressive Text-to-Image Generation

Hyelin Nam; Jihong Park; Jinho Choi; Seong-Lyun Kim

arXiv:2309.04287 · eess.SP · September 11, 2023

Open Access

TL;DR

This paper introduces a novel communication framework that uses multi-modal generative models to transmit images as text prompts, enabling efficient, progressive image reconstruction through sequential word transmission.

Contribution

It proposes a new system that converts images to text and transmits words sequentially based on information priority, enhancing communication efficiency with generative models.

Findings

01 Sequential word transmission improves communication efficiency.

02 The system effectively reconstructs images from text prompts.

03 The system uses multi-modal generative models for real-world communication applications.

Abstract

This paper proposes a new communication system framework that leverages the generation capabilities of multi-modal generative models. In today's smart applications, communication can succeed by conveying perceptual meaning, which we represent as a text prompt. Text is a suitable semantic representation of image data because it has evolved to describe images and to generate them through multi-modal techniques, being interpreted in a manner similar to human cognition. Using text also reduces overhead compared with transmitting the intact data itself. The transmitter converts the target image to text through a multi-modal generation process, and the receiver reconstructs the image using the reverse process. Each word in the text has its own syntactic role and is responsible for a particular piece of the information the text contains. For further efficiency in communication…
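
The sequential, priority-based word transmission described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the role-to-priority ranking in `ROLE_PRIORITY`, the hand-tagged caption, and the function names are hypothetical and are not taken from the paper. The image-to-text and text-to-image models are left out; the sketch only shows how a receiver could rebuild progressively richer prompts as words arrive.

```python
from typing import Dict, List, Tuple

# Hypothetical ranking of syntactic roles (an assumption, not the paper's ordering):
# a lower number means the word is transmitted earlier.
ROLE_PRIORITY: Dict[str, int] = {"noun": 0, "verb": 1, "adjective": 2, "other": 3}


def transmission_order(tagged: List[Tuple[str, str]]) -> List[Tuple[int, str]]:
    """Transmitter side: return (position, word) packets sorted by role priority.

    Ties are broken by the word's original position in the sentence.
    """
    ranked = [
        (ROLE_PRIORITY.get(role, ROLE_PRIORITY["other"]), pos, word)
        for pos, (word, role) in enumerate(tagged)
    ]
    ranked.sort()
    return [(pos, word) for _, pos, word in ranked]


def progressive_prompts(packets: List[Tuple[int, str]]) -> List[str]:
    """Receiver side: after each packet, rebuild the partial prompt in original word order."""
    received: Dict[int, str] = {}
    prompts: List[str] = []
    for pos, word in packets:
        received[pos] = word
        prompts.append(" ".join(received[p] for p in sorted(received)))
    return prompts


# Hand-tagged example caption (hypothetical; a real system would obtain the
# caption from an image-to-text model and the roles from a syntactic parser).
caption = [("a", "other"), ("brown", "adjective"), ("dog", "noun"), ("runs", "verb")]
packets = transmission_order(caption)
prompts = progressive_prompts(packets)
for step, prompt in enumerate(prompts, start=1):
    print(step, prompt)
```

In a full pipeline, each partial prompt would be passed to a text-to-image generator at the receiver, so the reconstructed image could be refined as more words arrive.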

Taxonomy

Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques
