VeraRetouch: A Lightweight Fully Differentiable Framework for Multi-Task Reasoning Photo Retouching

Yihong Guo; Youwei Lyu; Jiajun Tang; Yizhuo Zhou; Hongliang Wang; Jinwei Chen; Changqing Zou; Qingnan Fan

arXiv:2604.27375 · cs.CV · May 21, 2026

TL;DR

VeraRetouch is a lightweight, fully differentiable framework for multi-task photo retouching. A 0.5B vision-language model plans the edits and a differentiable Retouch Renderer applies them, which enables end-to-end training and mobile deployment; the work also contributes a new retouching dataset.

Contribution

The paper introduces VeraRetouch, a fully differentiable multi-task retouching framework, together with a new large-scale dataset and a reinforcement learning strategy, so that photo retouching can be trained end to end.

Findings

01 Achieves state-of-the-art performance on multiple benchmarks.

02 Enables mobile deployment with a smaller model footprint.

03 Introduces the first million-scale professional retouching dataset.

Abstract

Reasoning photo retouching has gained significant traction, requiring models to analyze image defects, give reasoning processes, and execute precise retouching enhancements. However, existing approaches often rely on non-differentiable external software, creating optimization barriers and suffering from high parameter redundancy and limited generalization. To address these challenges, we propose VeraRetouch, a lightweight and fully differentiable framework for multi-task photo retouching. We employ a 0.5B Vision-Language Model (VLM) as the central intelligence to formulate retouching plans based on instructions and scene semantics. Furthermore, we develop a fully differentiable Retouch Renderer that replaces external tools, enabling direct end-to-end pixel-level training through decoupled control latents for lighting, global color, and specific color adjustments. To overcome data…
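To make the idea of a differentiable renderer concrete, here is a minimal sketch in plain Python. It is not the paper's Retouch Renderer: the function names (`render`, `loss_and_grads`, `fit`), the two toy controls (an exposure in stops for lighting and per-channel gains for global color), and the mean-squared-error loss are all assumptions chosen for illustration, and the "specific color" control is left out. The point is only that when every retouching step is a smooth function of its parameters, a pixel-level loss yields gradients for those parameters directly, with no external editing software in the loop.

```python
import math

def render(pixels, exposure, gains):
    """Toy renderer (hypothetical): exposure in stops, then per-channel gains."""
    scale = 2.0 ** exposure
    return [[p[c] * scale * gains[c] for c in range(3)] for p in pixels]

def loss_and_grads(pixels, target, exposure, gains):
    """Mean squared error against target, plus analytic parameter gradients."""
    out = render(pixels, exposure, gains)
    n = len(pixels) * 3
    scale = 2.0 ** exposure
    loss, d_exposure, d_gains = 0.0, 0.0, [0.0, 0.0, 0.0]
    for p, o, t in zip(pixels, out, target):
        for c in range(3):
            err = o[c] - t[c]
            loss += err * err / n
            # d(out)/d(exposure) = out * ln(2)
            d_exposure += 2.0 * err * o[c] * math.log(2.0) / n
            # d(out)/d(gain_c) = pixel_c * 2**exposure
            d_gains[c] += 2.0 * err * p[c] * scale / n
    return loss, d_exposure, d_gains

def fit(pixels, target, steps=2000, lr=0.5):
    """Plain gradient descent on the retouching parameters."""
    exposure, gains = 0.0, [1.0, 1.0, 1.0]
    for _ in range(steps):
        _, d_e, d_g = loss_and_grads(pixels, target, exposure, gains)
        exposure -= lr * d_e
        gains = [g - lr * dg for g, dg in zip(gains, d_g)]
    return exposure, gains

# Four RGB pixels in [0, 1] and a target produced by known (made-up) settings.
pixels = [[0.2, 0.4, 0.6], [0.8, 0.5, 0.3], [0.1, 0.9, 0.7], [0.6, 0.6, 0.2]]
target = render(pixels, 0.5, [1.1, 1.0, 0.9])
exposure, gains = fit(pixels, target)
final_loss, _, _ = loss_and_grads(pixels, target, exposure, gains)
```

In this toy setup exposure and a uniform gain both scale brightness, so the fitted parameters need not equal the ones used to build the target even when the rendered image matches it. A real system would presumably replace the hand-written gradients with automatic differentiation and let a model predict the parameters rather than fitting them per image.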

Code & Models

Repositories

OpenVeraTeam/VeraRetouch (GitHub)
