JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

Yunlong Lin; Zixu Lin; Kunjie Lin; Jinbin Bai; Panwang Pan; Chenxin Li; Haoyu Chen; Zhongdao Wang; Xinghao Ding; Wenbo Li; Shuicheng Yan

arXiv:2506.17612 · cs.CV · June 24, 2025



TL;DR

JarvisArt is an MLLM-driven photo retouching agent that interprets user intent, mimics the reasoning process of professional artists, and coordinates over 200 retouching tools within Lightroom, offering more customizable, higher-quality edits than existing AI-based solutions.

Contribution

The paper introduces a multi-modal large language model (MLLM)-driven agent with a two-stage training process and seamless Lightroom integration, advancing automated, personalized photo retouching.

Findings

01 Outperforms GPT-4o, with a 60% improvement in pixel-level metrics

02 Demonstrates superior generalization and fine-grained control

03 Achieves user-friendly interaction and high content fidelity

Abstract

Photo retouching has become integral to contemporary visual storytelling, enabling users to capture aesthetics and express creativity. While professional tools such as Adobe Lightroom offer powerful capabilities, they demand substantial expertise and manual effort. In contrast, existing AI-based solutions provide automation but often suffer from limited adjustability and poor generalization, failing to meet diverse and personalized editing needs. To bridge this gap, we introduce JarvisArt, a multi-modal large language model (MLLM)-driven agent that understands user intent, mimics the reasoning process of professional artists, and intelligently coordinates over 200 retouching tools within Lightroom. JarvisArt undergoes a two-stage training process: an initial Chain-of-Thought supervised fine-tuning to establish basic reasoning and tool-use skills, followed by Group Relative Policy…
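
The idea of an agent coordinating many retouching tools can be pictured as a plan of tool calls that is validated before being applied. The following is a minimal sketch under stated assumptions, not JarvisArt's actual interface: the `ToolCall` type, the `apply_plan` helper, and the three tool names and value ranges in `TOOL_RANGES` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical tool registry: names and ranges are illustrative only,
# not the actual Lightroom parameter set used by JarvisArt.
TOOL_RANGES = {
    "exposure": (-5.0, 5.0),
    "contrast": (-100.0, 100.0),
    "saturation": (-100.0, 100.0),
}


@dataclass(frozen=True)
class ToolCall:
    """One retouching step proposed by the agent (hypothetical type)."""
    tool: str
    value: float


def apply_plan(plan, settings=None):
    """Validate each tool call and fold it into a settings dict.

    Unknown tools raise ValueError; out-of-range values are clamped
    to the tool's allowed range.
    """
    result = dict(settings or {})
    for call in plan:
        if call.tool not in TOOL_RANGES:
            raise ValueError(f"unknown tool: {call.tool}")
        low, high = TOOL_RANGES[call.tool]
        result[call.tool] = max(low, min(high, call.value))
    return result


# A two-step plan; the contrast value exceeds its range and is clamped.
plan = [ToolCall("exposure", 0.6), ToolCall("contrast", 140.0)]
final_settings = apply_plan(plan)
print(final_settings)  # {'exposure': 0.6, 'contrast': 100.0}
```

Keeping the plan as plain data separates what the model proposes from what the editor executes, so invalid or out-of-range steps can be caught before any image is touched.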
