Soulstyler: Using Large Language Model to Guide Image Style Transfer for Target Object

Junhao Chen; Peng Rong; Jingbo Sun; Chao Li; Xiang Li; Hongwu Lv

arXiv:2311.13562 · cs.CV · November 30, 2023 · 1 citation

TL;DR

Soulstyler enables targeted image style transfer guided by natural language descriptions, allowing precise stylization of specific objects without altering background regions, advancing flexible and user-controlled style transfer techniques.

Contribution

Introduces a novel framework combining large language models and CLIP-based embeddings for text-guided localized style transfer on specific image objects.

Findings

01. Accurately stylizes target objects based on textual descriptions

02. Maintains original style of non-target regions

03. Demonstrates effectiveness through experimental results

Abstract

Image style transfer occupies an important place in both computer graphics and computer vision. However, most current methods require a reference style image and cannot stylize specific objects individually. To overcome this limitation, we propose the "Soulstyler" framework, which allows users to guide the stylization of specific objects in an image through simple textual descriptions. We introduce a large language model to parse the text and identify the stylization target and the desired style. Combined with a CLIP-based semantic visual embedding encoder, the model understands and matches text and image content. We also introduce a novel localized text-image block matching loss that ensures that style transfer is performed only on the specified target objects, while non-target regions remain in their original style. Experimental results demonstrate that our model is able to accurately…
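
The abstract above describes a localized text-image block matching loss without giving its form. As a rough illustration of the general idea only, the sketch below weights a CLIP-style directional term by how much each image patch overlaps the target object and penalizes any change in the remaining patches. Everything here is an assumption made for illustration, not the paper's formulation: the function name localized_patch_loss, the per-patch target_weight input, the squared-distance preservation term, and the use of plain Python lists in place of real CLIP embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def localized_patch_loss(content_emb, stylized_emb, target_weight,
                         text_direction, preserve_weight=1.0):
    """Hypothetical masked patch loss (illustrative, not the paper's definition).

    content_emb, stylized_emb: per-patch embedding vectors of the input and
        the stylized output (stand-ins for image-encoder features).
    target_weight: per-patch value in [0, 1], the assumed fraction of the
        patch covered by the target object.
    text_direction: embedding of the style text minus embedding of a neutral
        source text (stand-in for text-encoder features).
    """
    style_term = 0.0
    keep_term = 0.0
    for c, s, w in zip(content_emb, stylized_emb, target_weight):
        delta = [a - b for a, b in zip(s, c)]
        # Target patches: the change in image embedding should point along
        # the text direction (directional loss, 1 - cosine similarity).
        style_term += w * (1.0 - cosine(delta, text_direction))
        # Non-target patches: any change in embedding is penalized.
        keep_term += (1.0 - w) * sum(d * d for d in delta)
    return (style_term + preserve_weight * keep_term) / len(content_emb)
```

In a real pipeline the embeddings would come from an image and text encoder and the weights from a segmentation mask of the object named in the prompt; the loss is lowest when target patches move along the text direction and all other patches stay unchanged.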

Code & Models

Repositories

yisuanwang/soulstyler
PyTorch · Official

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Human Motion and Animation · Face recognition and analysis