Agentic Planning with Reasoning for Image Styling via Offline RL
Subhojyoti Mukherjee, Stefano Petrangeli, Branislav Kveton, Trung Bui, Franck Dernoncourt, Arko Mukherjee

TL;DR
This paper introduces a structured agentic planning framework with reasoning for image styling. It uses offline reinforcement learning and compositional tools to handle complex image editing tasks that direct prompt-based methods struggle with.
Contribution
It proposes a novel tool-based planning methodology with explicit reasoning, new datasets with reasoning chains, and offline RL training methods that enhance image styling performance.
Findings
Outperforms baseline methods in visual quality and instruction following.
Demonstrates effectiveness on large-scale models with human validation.
Provides publicly available datasets and code for further research.
Abstract
Direct prompt-based editing often fails on complex transformations because vague, subjective prompts require a nuanced understanding of what should change in the image. Our core intuition is that using compositional image editing tools, rather than direct prompting, benefits from structured agent-level planning with explicit reasoning and leads to better results. This structured planning framework enables efficient offline RL post-training on quality-scored trajectories to improve performance. We present a tool-based agentic RL post-training framework that addresses this failure mode through structured planning with chain-of-thought reasoning. Our key contributions include: (1) A tool-based agentic planning methodology that combines a compositional library of orthogonal primitive transformations, structured context representation, and explicit per-step reasoning to decompose complex…
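To make the idea of a tool-based plan with per-step reasoning more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract excerpt does not list the actual tools, the context representation, or the offline RL objective. The `Image` stand-in, the three tools in `TOOLS`, the `Step` record, and the exponential weighting in `trajectory_weights` are all hypothetical choices made for illustration. The weighting shown is a generic reward-weighted scheme, swapped in as one plausible way to use quality-scored trajectories.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for an image: a few global style attributes in [0, 1].
Image = Dict[str, float]


def _clamp(x: float) -> float:
    return max(0.0, min(1.0, x))


def _adjust(attr: str) -> Callable[[Image, float], Image]:
    """Build a primitive tool that shifts exactly one attribute."""
    def tool(img: Image, delta: float) -> Image:
        out = dict(img)  # copy so the input image is left untouched
        out[attr] = _clamp(out[attr] + delta)
        return out
    return tool


# Hypothetical tool library: each primitive touches a different attribute,
# so the tools do not interfere with one another.
TOOLS: Dict[str, Callable[[Image, float], Image]] = {
    "brightness": _adjust("brightness"),
    "contrast": _adjust("contrast"),
    "saturation": _adjust("saturation"),
}


@dataclass
class Step:
    reasoning: str  # explicit per-step rationale recorded with the action
    tool: str       # key into TOOLS
    delta: float    # tool parameter


def apply_plan(img: Image, plan: List[Step]) -> Image:
    """Execute the plan's tool calls in order."""
    for step in plan:
        img = TOOLS[step.tool](img, step.delta)
    return img


def trajectory_weights(scores: List[float], beta: float = 1.0) -> List[float]:
    """Assumed reward-weighted scheme: softmax of quality scores over trajectories."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / beta) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

A plan is then a list such as `[Step("scene is too bright for a moody look", "brightness", -0.2)]`, and `trajectory_weights` gives higher-scored plans more influence when they are used as training data. A smaller `beta` concentrates the weight on the best-scored trajectories.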
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Cell Image Analysis Techniques
