A Taxonomy of Prompt Modifiers for Text-To-Image Generation
Jonas Oppenlaender

TL;DR
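This paper presents a taxonomy of six prompt modifier types used in text-to-image generation, derived from a 3-month ethnographic study of the online practitioner community. The taxonomy gives researchers a starting point for studying the practice and may help practitioners improve their AI-generated images.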
Contribution
It introduces a novel taxonomy of prompt modifiers derived from an ethnographic study, providing a conceptual framework for research and practice in prompt engineering for text-to-image models.
Findings
Identified six types of prompt modifiers used in practice.
Outlined how prompt modifiers are applied in the practice of prompt engineering.
Discussed implications for Human-Computer Interaction and Human-AI Interaction.
Abstract
Text-to-image generation has seen an explosion of interest since 2021. Today, beautiful and intriguing digital images and artworks can be synthesized from textual inputs ("prompts") with deep generative models. Online communities around text-to-image generation and AI generated art have quickly emerged. This paper identifies six types of prompt modifiers used by practitioners in the online community based on a 3-month ethnographic study. The novel taxonomy of prompt modifiers provides researchers a conceptual starting point for investigating the practice of text-to-image generation, but may also help practitioners of AI generated art improve their images. We further outline how prompt modifiers are applied in the practice of "prompt engineering." We discuss research opportunities of this novel creative practice in the field of Human-Computer Interaction (HCI). The paper concludes with a…
