T2CI-GAN: Text to Compressed Image generation using Generative Adversarial Network
Bulla Rajesh, Nandakishore Dusa, Mohammed Javed, Shiv Ram Dubey, P. Nagabhushan

TL;DR
This paper introduces T2CI-GAN, a novel approach that generates compressed visual data directly from text descriptions using GANs, improving efficiency by operating in the compressed domain.
Contribution
The work presents the first GAN models capable of generating JPEG compressed images directly from text, enhancing storage and computational efficiency in visual data synthesis.
Findings
Achieved state-of-the-art results in the JPEG compressed domain on the Oxford-102 dataset.
Demonstrated effective generation of compressed images from text descriptions.
Validated models on both RGB and JPEG compressed data.
Abstract
The problem of generating textual descriptions for visual data has gained research attention in recent years. In contrast, the problem of generating visual data from textual descriptions remains very challenging because it requires combining Natural Language Processing (NLP) and Computer Vision techniques. Existing methods use Generative Adversarial Networks (GANs) to generate uncompressed images from textual descriptions. In practice, however, most visual data are processed and transmitted in compressed form. Hence, the proposed work attempts to generate visual data directly in compressed form using Deep Convolutional GANs (DCGANs) to achieve storage and computational efficiency. We propose GAN models for compressed image generation from text. The first model is directly trained with JPEG…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Video Analysis and Summarization
