DialogPaint: A Dialog-based Image Editing Model
Jingxuan Wei, Shiyu Wu, Xin Jiang, Yequan Wang

TL;DR
DialogPaint is a framework that combines a dialogue model with image editing, letting users modify images through natural-language conversation over multiple, iterative rounds.
Contribution
The paper integrates a dialogue model with Stable Diffusion for interactive image editing, handling both explicit and ambiguous instructions in a multi-round setting.
Findings
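Interprets and executes both explicit and ambiguous instructions, including object replacement, style transfer, and color modification
Supports iterative, multi-round editing
Shows robustness and versatility in the authors' evaluations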
Abstract
We introduce DialogPaint, a novel framework that bridges conversational interactions with image editing, enabling users to modify images through natural dialogue. By integrating a dialogue model with the Stable Diffusion image transformation technique, DialogPaint offers a more intuitive and interactive approach to image modifications. Our method stands out by effectively interpreting and executing both explicit and ambiguous instructions, handling tasks such as object replacement, style transfer, and color modification. Notably, DialogPaint supports iterative, multi-round editing, allowing users to refine image edits over successive interactions. Comprehensive evaluations highlight the robustness and versatility of our approach, marking a significant advancement in dialogue-driven image editing.
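The multi-round interaction described in the abstract can be pictured as a loop: a dialogue component turns the conversation into an explicit edit instruction, or asks for clarification when the request is ambiguous, and an editing component applies that instruction to the current image. The sketch below is only an illustration under stated assumptions, not the authors' implementation. Every name in it (EditSession, run_round, toy_dialogue_model, toy_edit_model) is hypothetical, the "image" is a plain string that records the edits applied, and the keyword check stands in for a real dialogue model, just as the string append stands in for a Stable Diffusion based editor.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical interfaces, not taken from the paper:
# a dialogue model maps the conversation history to an explicit edit
# instruction, or None when the request is still ambiguous.
DialogueModel = Callable[[List[str]], Optional[str]]
# an edit model maps (image, instruction) to an edited image.
EditModel = Callable[[str, str], str]

# Toy heuristic: a turn counts as "explicit" if it starts with one of these verbs.
EDIT_VERBS = ("replace", "make", "turn", "change")


@dataclass
class EditSession:
    image: str  # placeholder for real image data
    history: List[str] = field(default_factory=list)


def toy_dialogue_model(history: List[str]) -> Optional[str]:
    """Stand-in for a dialogue model: keyword check on the latest user turn."""
    last_user_turn = history[-1].split(": ", 1)[1]
    if last_user_turn.lower().startswith(EDIT_VERBS):
        return last_user_turn
    return None


def toy_edit_model(image: str, instruction: str) -> str:
    """Stand-in for an image editor: records the instruction on the 'image'."""
    return f"{image} -> [{instruction}]"


def run_round(
    session: EditSession,
    user_turn: str,
    dialogue_model: DialogueModel,
    edit_model: EditModel,
) -> str:
    """Run one dialogue round and return the assistant's reply."""
    session.history.append(f"user: {user_turn}")
    instruction = dialogue_model(session.history)
    if instruction is None:
        # Ambiguous request: ask for clarification and leave the image unchanged.
        reply = "Could you say which edit you would like?"
    else:
        # Explicit request: edit the current image so later rounds build on it.
        session.image = edit_model(session.image, instruction)
        reply = f"Applied: {instruction}"
    session.history.append(f"assistant: {reply}")
    return reply


if __name__ == "__main__":
    session = EditSession(image="photo.png")
    for turn in ("It looks a bit boring",
                 "Replace the dog with a cat",
                 "Make the sky orange"):
        print(run_round(session, turn, toy_dialogue_model, toy_edit_model))
    print(session.image)
```

Keeping the current image and the history on one session object is one simple way to let each round refine the result of the previous one; a real system would replace the two toy callables with actual models.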
Taxonomy
Topics: Multimodal Machine Learning Applications · Virtual Reality Applications and Impacts · Video Analysis and Summarization
Methods: Diffusion
