Creo: From One-Shot Image Generation to Progressive, Co-Creative Ideation
Zoe De Simone, Angie Boggust, Fredo Durand, Ashia Wilson, Arvind Satyanarayan

TL;DR
Creo is a multi-stage text-to-image system that lets users build images incrementally and controllably, from rough sketches to high-resolution outputs, with the aim of increasing user agency and output diversity.
Contribution
This work introduces a multi-stage T2I system with intermediate abstractions and decision locking, which improve controllability and user involvement.
Findings
Participants felt stronger ownership over Creo outputs.
Creo outputs are less homogeneous than one-shot results.
Multi-stage generation improves controllability and diversity.
Abstract
Text-to-image (T2I) systems enable rapid generation of high-fidelity imagery but are misaligned with how visual ideas develop. Their outputs make implicit visual decisions on behalf of the user, often introduce fine-grained details that anchor users prematurely and limit their ability to keep options open early on, and cause unintended changes during editing that are difficult to correct and reduce users' sense of control. To address these concerns, we present Creo, a multi-stage T2I system that scaffolds image generation by progressing from rough sketches to high-resolution outputs, exposing intermediate abstractions where users can make incremental changes. Because sketch-like abstractions are provisional, they invite user editing and allow users to keep design options open while ideas are still forming. Each stage in Creo can be modified with manual changes…
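The staged workflow and decision locking described above can be pictured with a small toy model. This is only an illustrative sketch, not Creo's implementation: the stage names, the `Canvas` class, the `advance` function, and the idea of representing an image as a dictionary of named decisions are all assumptions introduced here for clarity.

```python
from dataclasses import dataclass, field

# Hypothetical stage names, ordered from most abstract to most detailed.
# The source only names the endpoints (rough sketch, high-resolution output).
STAGES = ["sketch", "refined", "high_res"]


@dataclass
class Canvas:
    """Toy stand-in for an image: a dict of named visual decisions."""
    stage: str = STAGES[0]
    decisions: dict = field(default_factory=dict)
    locked: set = field(default_factory=set)

    def edit(self, name, value):
        """Apply a manual change unless the decision is locked."""
        if name in self.locked:
            raise ValueError(f"decision {name!r} is locked")
        self.decisions[name] = value

    def lock(self, name):
        """Freeze an existing decision so later steps cannot change it."""
        if name not in self.decisions:
            raise KeyError(name)
        self.locked.add(name)


def advance(canvas, proposals):
    """Move to the next stage. Proposals (a stand-in for what a generator
    might suggest) only fill or overwrite decisions that are not locked."""
    idx = STAGES.index(canvas.stage)
    if idx + 1 >= len(STAGES):
        raise ValueError("already at the final stage")
    merged = dict(canvas.decisions)
    for name, value in proposals.items():
        if name not in canvas.locked:
            merged[name] = value
    return Canvas(stage=STAGES[idx + 1], decisions=merged,
                  locked=set(canvas.locked))


# Usage: lock the composition at the sketch stage, then advance.
c = Canvas()
c.edit("composition", "cat on left")
c.lock("composition")
c2 = advance(c, {"composition": "cat centered", "palette": "warm"})
```

In this toy model the locked composition survives the move to the next stage while the unlocked palette is filled in, which mirrors the idea of committing to some decisions early and leaving others open.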
