coDrawAgents: A Multi-Agent Dialogue Framework for Compositional Image Generation
Chunhan Li, Qifeng Wu, Jia-Hui Pan, Ka-Hei Hui, Jingyu Hu, Yuming Jiang, Bin Sheng, Xihui Liu, Wenjuan Gong, Zhengzhe Liu

TL;DR
coDrawAgents introduces a multi-agent dialogue framework with specialized roles that collaboratively enhance compositional image generation, significantly improving alignment, spatial accuracy, and attribute fidelity in complex scenes.
Contribution
This work presents a novel multi-agent dialogue system with four specialized agents that collaboratively improve compositional text-to-image generation, addressing layout complexity and error correction.
Findings
Improves text-image alignment and attribute fidelity
Enhances spatial accuracy in generated images
Outperforms existing methods on benchmark datasets
Abstract
Text-to-image generation has advanced rapidly, but existing models still struggle with faithfully composing multiple objects and preserving their attributes in complex scenes. We propose coDrawAgents, an interactive multi-agent dialogue framework in which four specialized agents (Interpreter, Planner, Checker, and Painter) collaborate to improve compositional generation. The Interpreter adaptively decides between a direct text-to-image pathway and a layout-aware multi-agent process. In the layout-aware mode, it parses the prompt into attribute-rich object descriptors, ranks them by semantic salience, and groups objects with the same semantic priority level for joint generation. Guided by the Interpreter, the Planner adopts a divide-and-conquer strategy, incrementally proposing layouts for objects with the same semantic priority level while grounding decisions in the evolving visual…
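The ranking-and-grouping step described for the Interpreter can be illustrated with a minimal sketch. This is not the authors' implementation: the `ObjectDescriptor` class, the integer `priority` field (lower meaning more salient), and the `group_by_priority` helper are hypothetical names introduced here, and the priorities are hand-assigned rather than produced by a model.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Tuple


@dataclass(frozen=True)
class ObjectDescriptor:
    """Hypothetical attribute-rich descriptor for one object in the prompt."""
    name: str
    attributes: Tuple[str, ...]
    priority: int  # assumption: lower value = higher semantic salience


def group_by_priority(objects: List[ObjectDescriptor]) -> List[List[ObjectDescriptor]]:
    """Order descriptors by priority and bundle equal-priority objects together.

    Each returned group stands for a set of objects that would be laid out
    jointly in one planning round.
    """
    # sorted() is stable, so objects with equal priority keep their input order.
    ordered = sorted(objects, key=lambda o: o.priority)
    return [list(group) for _, group in groupby(ordered, key=lambda o: o.priority)]


# Illustrative input only; the priorities are assigned by hand.
objects = [
    ObjectDescriptor("cat", ("black", "sleeping"), priority=1),
    ObjectDescriptor("table", ("wooden",), priority=0),
    ObjectDescriptor("vase", ("blue",), priority=1),
]

for round_index, group in enumerate(group_by_priority(objects)):
    print(round_index, [obj.name for obj in group])
```

In this sketch the table is planned alone in the first round, and the cat and vase share the second round because they carry the same priority level. How the actual system scores salience is not specified in the text above.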
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games
