StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models
Wen Li, Muyuan Fang, Cheng Zou, Biao Gong, Ruobing Zheng, Meng Wang, Jingdong Chen, and Ming Yang

TL;DR
StyleTokenizer introduces a novel zero-shot method for controlling image style in diffusion models by aligning style representations with text, using a new dataset and a style tokenizer to accurately capture style from a single image.
Contribution
The paper presents StyleTokenizer, a new approach that aligns style and text representations, enabling effective style control from a single image without compromising text prompt effectiveness.
Findings
Accurately captures style from a single reference image.
Generates images consistent with style and text prompts.
Outperforms existing methods in style control quality.
Abstract
Despite the burst of innovative methods for controlling the diffusion process, effectively controlling image styles in text-to-image generation remains a challenging task. Many adapter-based methods impose image representation conditions on the denoising process to accomplish image control. However, these conditions are not aligned with the word embedding space, leading to interference between image and text control conditions and the potential loss of semantic information from the text prompt. Addressing this issue involves two key challenges. The first is how to inject the style representation without compromising the effectiveness of the text representation in control. The second is how to obtain an accurate style representation from a single reference image. To tackle these challenges, we introduce StyleTokenizer, a zero-shot style control image generation method that aligns style…
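The abstract's point about alignment with the word embedding space can be illustrated with a small sketch. It is not the authors' implementation, and the text above does not describe the paper's architecture. It shows only one plausible reading: a style feature vector is projected into the same dimension as the text token embeddings and appended as extra tokens, so the text tokens themselves are left untouched. Every name and size here (`project`, `build_condition`, `EMBED_DIM`, `STYLE_DIM`, `N_TEXT`, `N_STYLE`) is hypothetical, and the random weights stand in for parameters that would be learned.

```python
import random

def project(style_vec, weight):
    """Map a style feature vector into the word-embedding dimension
    with a plain matrix-vector product (weight has EMBED_DIM rows)."""
    return [sum(w * x for w, x in zip(row, style_vec)) for row in weight]

def build_condition(text_tokens, style_vec, weights):
    """Append one projected style token per weight matrix to the text tokens.
    The text token embeddings are passed through unchanged."""
    style_tokens = [project(style_vec, w) for w in weights]
    return text_tokens + style_tokens

random.seed(0)
# Hypothetical sizes chosen only to keep the example small.
EMBED_DIM, STYLE_DIM, N_TEXT, N_STYLE = 8, 4, 5, 2

# Stand-ins for text token embeddings and a style feature from a reference image.
text_tokens = [[random.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(N_TEXT)]
style_vec = [random.gauss(0, 1) for _ in range(STYLE_DIM)]

# Random placeholders for learned projection weights, one matrix per style token.
weights = [
    [[random.gauss(0, 0.1) for _ in range(STYLE_DIM)] for _ in range(EMBED_DIM)]
    for _ in range(N_STYLE)
]

condition = build_condition(text_tokens, style_vec, weights)
print(len(condition), len(condition[0]))  # 7 8
```

Because the style tokens share the text tokens' dimension, the combined sequence can be consumed by the same conditioning pathway as an ordinary prompt. Whether the paper does exactly this cannot be confirmed from the truncated abstract.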
Taxonomy
Topics: Image Retrieval and Classification Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion
