Poetry in Pixels: Prompt Tuning for Poem Image Generation via Diffusion Models
Sofia Jamil, Bollampalli Areen Reddy, Raghvendra Kumar, Sriparna Saha, K J Joseph, Koustava Goswami

TL;DR
This paper introduces PoemToPixel, a framework that combines prompt tuning with the PoeKey algorithm to generate images that visually interpret poems, supported by MiniPo, a new dataset of children's poems and images.
Contribution
It presents the PoemToPixel framework with the PoeKey algorithm for extracting key poetic elements to improve image generation from poetry.
Findings
Effective alignment of generated images with poetic content
Introduction of MiniPo dataset for diverse poetry-image pairs
Quantitative and qualitative validation of the framework
Abstract
The task of text-to-image generation has encountered significant challenges when applied to literary works, especially poetry. Poems are a distinct form of literature, with meanings that frequently transcend the literal words. To address this shortcoming, we propose a PoemToPixel framework designed to generate images that visually represent the inherent meanings of poems. Our approach incorporates the concept of prompt tuning in our image generation framework to ensure that the resulting images closely align with the poetic content. In addition, we propose the PoeKey algorithm, which extracts three key elements in the form of emotions, visual elements, and themes from poems to form instructions which are subsequently provided to a diffusion model for generating corresponding images. Furthermore, to expand the diversity of the poetry dataset across different genres and ages, we…
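The abstract says PoeKey turns three kinds of extracted elements (emotions, visual elements, themes) into an instruction for a diffusion model, but it does not say how the extraction or the instruction template works. The Python sketch below only illustrates the final assembly step under stated assumptions: the `PoemElements` container, the `build_instruction` function, and the wording of the template are all hypothetical and are not taken from the paper, and the element values are supplied by hand rather than extracted from a poem.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PoemElements:
    """Hypothetical container for the three element types named in the abstract."""
    emotions: List[str] = field(default_factory=list)
    visual_elements: List[str] = field(default_factory=list)
    themes: List[str] = field(default_factory=list)


def build_instruction(elements: PoemElements) -> str:
    """Join the available elements into one text instruction.

    The template wording is an assumption made for illustration only;
    the paper's actual instruction format is not given in the abstract.
    """
    clauses = []
    if elements.visual_elements:
        clauses.append("depicting " + ", ".join(elements.visual_elements))
    if elements.emotions:
        clauses.append("conveying " + ", ".join(elements.emotions))
    if elements.themes:
        clauses.append("on the theme of " + ", ".join(elements.themes))
    if not clauses:
        # Nothing was extracted, so fall back to a generic instruction.
        return "An illustration of a poem"
    return "An illustration " + "; ".join(clauses)


# Hand-written example values (not from the MiniPo dataset).
example = PoemElements(
    emotions=["joy"],
    visual_elements=["a kite", "a green hill"],
    themes=["childhood"],
)
instruction = build_instruction(example)
print(instruction)
```

The resulting string could then be passed as the text prompt to any text-to-image diffusion pipeline; which model and prompt-tuning setup the authors use is not specified in the text above.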
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion · ALIGN
