DeckFlow: Iterative Specification on a Multimodal Generative Canvas

Gregory Croisdale; Emily Huang; John Joon Young Chung; Anhong Guo; Xu Wang; Austin Z. Henley; Cyrus Omar

arXiv:2506.15873 · cs.HC · June 23, 2025

TL;DR

DeckFlow is a multimodal generative AI tool that enables iterative task and specification decomposition on an infinite canvas, supporting creative workflows with text, image, and audio generation, and allowing recursive feedback for design refinement.

Contribution

The paper introduces DeckFlow, a multimodal generative AI tool that addresses three limitations of existing tools by supporting task decomposition, specification decomposition, and generative space exploration.

Findings

01

DeckFlow outperforms a conversational AI baseline in text-to-image tasks.

02

Users effectively utilize DeckFlow for open-ended creative projects.

03

DeckFlow facilitates iterative design with multimodal outputs.

Abstract

Generative AI promises to allow people to create high-quality personalized media. Although existing tooling is powerful, we identify three fundamental design problems with it through a literature review. We introduce a multimodal generative AI tool, DeckFlow, to address these problems. First, DeckFlow supports task decomposition by allowing users to maintain multiple interconnected subtasks on an infinite canvas populated by cards connected through visual dataflow affordances. Second, DeckFlow supports a specification decomposition workflow where an initial goal is iteratively decomposed into smaller parts and combined using feature labels and clusters. Finally, DeckFlow supports generative space exploration by generating multiple prompt and output variations, presented in a grid, that can feed back recursively into the next design iteration. We evaluate DeckFlow for text-to-image generation…
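
The card-and-dataflow idea in the abstract can be illustrated with a small sketch. This is not DeckFlow's implementation: the `Card` class, `expand`, and `variation_grid` are hypothetical names invented here, and the sketch assumes that a card holds a few text variants, that upstream cards feed into it, and that prompt variations come from picking one variant per card.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Card:
    """Hypothetical canvas card: text variants plus the cards that feed into it."""
    name: str
    variants: list[str]                                  # alternative phrasings of this part of the spec
    inputs: list["Card"] = field(default_factory=list)   # dataflow edges into this card


def expand(card: Card) -> list[str]:
    """Return every prompt formed by choosing one variant from each upstream card."""
    upstream = [expand(c) for c in card.inputs]
    prompts = []
    for own in card.variants:
        # product() of zero lists yields one empty tuple, so leaf cards
        # simply return their own variants.
        for combo in product(*upstream):
            prompts.append(", ".join([*combo, own]))
    return prompts


def variation_grid(prompts: list[str], columns: int) -> list[list[str]]:
    """Arrange prompts row by row, standing in for a grid of generated outputs."""
    return [prompts[i:i + columns] for i in range(0, len(prompts), columns)]


# Decompose one goal into two sub-specifications (assumed example content).
subject = Card("subject", ["a lighthouse", "a windmill"])
style = Card("style", ["watercolor", "linocut print"])
goal = Card("goal", ["at dusk"], inputs=[subject, style])

grid = variation_grid(expand(goal), columns=2)

# Feed one chosen variation back in as the input to the next iteration.
picked = Card("picked", [grid[0][1]])
refined = Card("refined", ["with storm clouds"], inputs=[picked])
next_prompts = expand(refined)
```

In this sketch the recursion over `inputs` plays the role of the dataflow connections, and wrapping a chosen grid cell in a new `Card` mimics feeding an output back into the next design iteration. A real tool would call image, text, or audio models on each prompt; that step is omitted here.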

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Music Technology and Sound Studies