Text-Guided Mask-free Local Image Retouching

Zerun Liu; Fan Zhang; Jingxuan He; Jin Wang; Zhangye Wang; Lechao Cheng

arXiv:2212.07603·cs.CV·February 27, 2023

TL;DR

This paper introduces a novel text-guided, mask-free image retouching method that generates plausible masks based on text descriptions, enabling high-quality retouching without object-level supervision.

Contribution

It presents a new approach for image retouching that does not require mask supervision, broadening the applicability of deep learning in this domain.

Findings

01. Produces high-quality, accurate retouched images from natural-language text descriptions

02. Constructs plausible, edge-sharp masks based on text descriptions

03. Demonstrates effectiveness through extensive experiments

Abstract

In the realm of multi-modality, text-guided image retouching techniques emerged with the advent of deep learning. Most currently available text-guided methods, however, rely on object-level supervision to constrain the region that may be modified. This not only makes it more challenging to develop these algorithms, but it also limits how widely deep learning can be used for image retouching. In this paper, we offer a text-guided mask-free image retouching approach that yields consistent results to address this concern. In order to perform image retouching without mask supervision, our technique can construct plausible and edge-sharp masks based on the text for each object in the image. Extensive experiments have shown that our method can produce high-quality, accurate images based on spoken language. The source code will be released soon.
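
To make concrete what it means for a mask to "constrain the region that may be modified", the sketch below shows generic per-pixel mask compositing. It is not the paper's method, whose code is not given here; the function name `composite`, the nested-list image layout, and the toy values are all assumptions chosen for illustration. A mask value of 1 takes the edited pixel and a value of 0 keeps the original pixel.

```python
def composite(original, edited, mask):
    """Blend two single-channel images pixel by pixel.

    Hypothetical helper for illustration only (not from the paper).
    Each image is a list of rows; mask values are expected in [0, 1].
    mask == 1 selects the edited pixel, mask == 0 keeps the original.
    """
    if not (len(original) == len(edited) == len(mask)):
        raise ValueError("inputs must have the same number of rows")
    out = []
    for o_row, e_row, m_row in zip(original, edited, mask):
        if not (len(o_row) == len(e_row) == len(m_row)):
            raise ValueError("rows must have the same number of columns")
        out.append([m * e + (1 - m) * o for o, e, m in zip(o_row, e_row, m_row)])
    return out


# Toy 2x3 grayscale images (made-up values).
original = [[10, 10, 10], [10, 10, 10]]
edited = [[200, 200, 200], [200, 200, 200]]
mask = [[0, 1, 1], [0, 0, 1]]  # hard-edged binary mask

result = composite(original, edited, mask)
print(result)
```

A binary mask like the one above gives a hard boundary between edited and untouched pixels. Fractional mask values would instead blend the two images near the boundary.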

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection