ImageEdit-R1: Boosting Multi-Agent Image Editing via Reinforcement Learning
Yiran Zhao, Yaoqi Ye, Xiang Liu, Michael Qizhe Shieh, Trung Bui

TL;DR
ImageEdit-R1 is a multi-agent framework for complex, context-aware image editing. It uses reinforcement learning to coordinate specialized pretrained vision-language and generative agents, and is reported to outperform existing models on instructions that are complex, indirect, or multi-step.
Contribution
This work presents a novel multi-agent reinforcement learning approach that enables dynamic, goal-oriented image editing through coordinated decision-making among specialized agents.
Findings
Outperforms existing closed-source diffusion models
Achieves better results on multiple image editing datasets
Demonstrates effective multi-agent collaboration in editing tasks
Abstract
With the rapid advancement of commercial multi-modal models, image editing has garnered significant attention due to its widespread applicability in daily life. Despite impressive progress, existing image editing systems, particularly closed-source or proprietary models, often struggle with complex, indirect, or multi-step user instructions. These limitations hinder their ability to perform nuanced, context-aware edits that align with human intent. In this work, we propose ImageEdit-R1, a multi-agent framework for intelligent image editing that leverages reinforcement learning to coordinate high-level decision-making across a set of specialized, pretrained vision-language and generative agents. Each agent is responsible for a distinct capability, such as understanding user intent, identifying regions of interest, selecting appropriate editing actions, and synthesizing visual…
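The division of labor described in the abstract can be pictured as a staged pipeline in which each agent fills in one part of a shared editing state. The following is only a minimal sketch under stated assumptions, not the authors' implementation: every name here (EditState, intent_agent, region_agent, action_agent, synthesis_agent, run_pipeline) is hypothetical, the agents are rule-based stubs rather than pretrained models, and the reinforcement learning component is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical region format: (x0, y0, x1, y1) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class EditState:
    """Shared state passed from agent to agent (illustrative only)."""
    instruction: str
    intent: str = ""
    regions: List[Box] = field(default_factory=list)
    action: str = ""
    log: List[str] = field(default_factory=list)


def intent_agent(state: EditState) -> EditState:
    # Stand-in for a vision-language model that interprets the request.
    state.intent = state.instruction.strip().lower()
    state.log.append("intent")
    return state


def region_agent(state: EditState) -> EditState:
    # Stand-in for a grounding model; returns one fixed placeholder box.
    state.regions = [(0, 0, 64, 64)]
    state.log.append("regions")
    return state


def action_agent(state: EditState) -> EditState:
    # Stand-in for action selection. A learned policy could replace
    # this keyword rule; that is an assumption, not the paper's method.
    state.action = "remove" if "remove" in state.intent else "modify"
    state.log.append("action")
    return state


def synthesis_agent(state: EditState) -> EditState:
    # Stand-in for a generative model; only records what it would do.
    state.log.append(f"synthesize:{state.action}@{len(state.regions)}")
    return state


PIPELINE: List[Callable[[EditState], EditState]] = [
    intent_agent,
    region_agent,
    action_agent,
    synthesis_agent,
]


def run_pipeline(instruction: str) -> EditState:
    """Run every agent in order over a fresh state."""
    state = EditState(instruction=instruction)
    for agent in PIPELINE:
        state = agent(state)
    return state
```

Calling run_pipeline("Remove the car") walks the four stubs in order and leaves a log of which stage ran. Keeping the state in one dataclass is a design choice made for this sketch so that a coordinator could inspect intermediate results between stages.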
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Cell Image Analysis Techniques
