Text-Guided Mask-free Local Image Retouching
Zerun Liu, Fan Zhang, Jingxuan He, Jin Wang, Zhangye Wang, Lechao Cheng

TL;DR
This paper introduces a novel text-guided, mask-free image retouching method that generates plausible masks based on text descriptions, enabling high-quality retouching without object-level supervision.
Contribution
It presents a new approach for image retouching that does not require mask supervision, broadening the applicability of deep learning in this domain.
Findings
Produces high-quality, accurate retouched images from natural-language text descriptions
Constructs plausible, edge-sharp masks based on text descriptions
Demonstrates effectiveness through extensive experiments
Abstract
In the realm of multi-modality, text-guided image retouching techniques emerged with the advent of deep learning. Most currently available text-guided methods, however, rely on object-level supervision to constrain the region that may be modified. This not only makes it more challenging to develop these algorithms, but it also limits how widely deep learning can be used for image retouching. In this paper, we offer a text-guided mask-free image retouching approach that yields consistent results to address this concern. In order to perform image retouching without mask supervision, our technique can construct plausible and edge-sharp masks based on the text for each object in the image. Extensive experiments have shown that our method can produce high-quality, accurate images based on spoken language. The source code will be released soon.
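The abstract does not say how the masks are built or applied, so the following is only a minimal sketch of what confining a retouch to a masked region means, not the authors' method. The helper name `composite`, the grayscale nested-list image format, and the linear blending rule are all assumptions made for illustration.

```python
def composite(original, edited, mask):
    """Blend `edited` into `original` wherever `mask` is close to 1.

    Hypothetical helper, not taken from the paper. All three arguments
    are same-sized 2D lists of floats: grayscale images in [0, 1] and a
    per-pixel mask weight in [0, 1].
    """
    if not (len(original) == len(edited) == len(mask)):
        raise ValueError("inputs must have the same height")
    out = []
    for row_o, row_e, row_m in zip(original, edited, mask):
        if not (len(row_o) == len(row_e) == len(row_m)):
            raise ValueError("inputs must have the same width")
        # mask = 1 takes the edited pixel, mask = 0 keeps the original pixel.
        out.append([m * e + (1.0 - m) * o
                    for o, e, m in zip(row_o, row_e, row_m)])
    return out


# Toy 2x2 example: only the top-left pixel is allowed to change.
original = [[0.2, 0.2], [0.2, 0.2]]
edited = [[0.9, 0.9], [0.9, 0.9]]
mask = [[1.0, 0.0], [0.0, 0.0]]
result = composite(original, edited, mask)
```

In this toy setup the mask is written by hand. A mask-free method in the sense of the abstract would instead have to infer such a mask from the text description, without object-level mask supervision during training.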
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection
