Text-to-Image Synthesis Based on Machine Generated Captions
Marco Menardi, Alex Falcon, Saida S. Mohamed, Lorenzo Seidenari, Giuseppe Serra, Alberto Del Bimbo, Carlo Tasso

TL;DR
This paper presents a method for text-to-image synthesis that leverages uncaptioned image datasets by generating captions with an image captioning module and then training a conditional GAN on these captions and images.
Contribution
It introduces a novel approach that combines image captioning with GAN training to enable text-based image generation from uncaptioned datasets.
Findings
Preliminary results show promising image generation quality.
The approach effectively uses uncaptioned datasets for text-to-image synthesis.
A comparison with an unconditional GAN demonstrates the potential of the method.
Abstract
Text-to-image synthesis refers to the automatic generation of a photo-realistic image from a given text and is revolutionizing many real-world applications. Performing this process requires datasets of captioned images, meaning that each image is associated with one or more captions describing it. Despite the abundance of uncaptioned image datasets, the number of captioned datasets is limited. To address this issue, in this paper we propose an approach capable of generating images from a given text using conditional GANs trained on an uncaptioned image dataset. In particular, uncaptioned images are fed to an Image Captioning Module to generate the descriptions. Then, the GAN Module is trained on both the input image and the machine-generated caption. To evaluate the results, the performance of our solution is compared with…
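The data flow described in the abstract (uncaptioned images are captioned by a module, and the resulting image and caption pairs are used to train a conditional GAN) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: `CaptionedExample`, `build_captioned_dataset`, `toy_captioner`, and `training_pairs` are hypothetical names, the captioner is a trivial stand-in for a trained captioning model, images are represented as plain lists of floats, and the GAN itself is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Sequence, Tuple

# Stand-in for pixel data; a real pipeline would use image tensors.
Image = Sequence[float]


@dataclass
class CaptionedExample:
    """An image paired with a machine-generated caption."""
    image: Image
    caption: str


def build_captioned_dataset(
    images: List[Image],
    captioner: Callable[[Image], str],
) -> List[CaptionedExample]:
    """Run the captioning module over every uncaptioned image."""
    return [CaptionedExample(image=img, caption=captioner(img)) for img in images]


def toy_captioner(image: Image) -> str:
    """Hypothetical placeholder for a trained image captioning model."""
    mean = sum(image) / len(image)
    return "a bright image" if mean > 0.5 else "a dark image"


def training_pairs(
    dataset: List[CaptionedExample],
    batch_size: int,
) -> Iterator[Tuple[List[Image], List[str]]]:
    """Yield (images, captions) batches that a conditional GAN would train on."""
    for start in range(0, len(dataset), batch_size):
        batch = dataset[start:start + batch_size]
        yield [ex.image for ex in batch], [ex.caption for ex in batch]
```

In such a setup, each batch would supply the real images to the discriminator and the captions as the conditioning input to both generator and discriminator; how the captions are embedded is left open here.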
Taxonomy
Methods: Convolution
