CleanStyle: Plug-and-Play Style Conditioning Purification for Text-to-Image Stylization
Xiaoman Feng, Mingkun Lei, Yang Wang, Dingwen Fu, Chi Zhang

TL;DR
CleanStyle is a plug-and-play framework that reduces content leakage in diffusion-based style transfer by filtering style embeddings and using style-specific guidance, leading to more faithful and high-quality stylized images.
Contribution
CleanStyle introduces an SVD-based filtering method and style-specific guidance that improve style-transfer fidelity without retraining the diffusion model.
Findings
Significantly reduces content leakage in stylized images.
Enhances prompt fidelity and visual quality of stylizations.
Compatible with existing models without retraining.
Abstract
Style transfer in diffusion models enables controllable visual generation by injecting the style of a reference image. However, recent encoder-based methods, while efficient and tuning-free, often suffer from content leakage, where semantic elements from the style image undesirably appear in the output, impairing prompt fidelity and stylistic consistency. In this work, we introduce CleanStyle, a plug-and-play framework that filters out content-related noise from the style embedding without retraining. Motivated by empirical analysis, we observe that such leakage predominantly stems from the tail components of the style embedding, which are isolated via Singular Value Decomposition (SVD). To address this, we propose CleanStyleSVD (CS-SVD), which dynamically suppresses tail components using a time-aware exponential schedule, providing clean, style-preserving conditional embeddings…
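The abstract describes CS-SVD only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It decomposes a style embedding with SVD, keeps the leading singular components untouched, and scales the tail components by a step-dependent exponential factor. The function names (`tail_scale`, `suppress_tail`), the parameters `keep` and `rate`, the embedding shape, and the direction of the schedule (strong suppression at early steps, relaxing toward later steps) are all assumptions made for illustration; the source does not specify them.

```python
import math
import numpy as np


def tail_scale(t, num_steps, rate=5.0):
    """Hypothetical exponential schedule for the tail singular values.

    Assumption: t counts denoising steps from 0 to num_steps - 1, and the
    tail is suppressed most strongly at t = 0, relaxing to no suppression
    at the final step. The paper's actual schedule may differ.
    """
    progress = t / max(num_steps - 1, 1)
    return math.exp(-rate * (1.0 - progress))


def suppress_tail(embedding, keep, scale):
    """Scale all but the first `keep` singular components by `scale`.

    embedding: 2-D array, assumed here to be (num_tokens, dim).
    keep:      number of leading singular components left unchanged.
    scale:     factor in [0, 1] applied to the remaining (tail) components.
    """
    u, s, vt = np.linalg.svd(embedding, full_matrices=False)
    s_filtered = s.copy()
    s_filtered[keep:] *= scale
    # Rebuild the embedding from the modified singular values.
    return (u * s_filtered) @ vt


# Illustrative usage on a random stand-in for a style embedding.
rng = np.random.default_rng(0)
style_embedding = rng.standard_normal((16, 64))
num_steps = 50
for t in (0, 25, 49):
    filtered = suppress_tail(style_embedding, keep=4,
                             scale=tail_scale(t, num_steps))
    # `filtered` would replace the raw style embedding as conditioning
    # at step t in an encoder-based stylization pipeline.
```

In this sketch `keep` is a fixed hyperparameter; how many components count as "tail" in CleanStyle is not stated in the abstract above.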
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Humanities and Scholarship · Aesthetic Perception and Analysis
