Less is More: Masking Elements in Image Condition Features Avoids Content Leakages in Style Transfer Diffusion Models
Lin Zhu, Xinbing Wang, Chenghu Zhou, Qinying Gu, Nanyang Ye

TL;DR
This paper introduces a masking technique to better disentangle content and style in diffusion-based image generation, improving style transfer quality by reducing content leakage.
Contribution
The paper proposes a simple, parameter-free masking method that effectively separates content from style in diffusion models, enhancing style transfer performance.
Findings
Masking specific image features reduces content leakage.
Guiding generation with fewer, appropriately selected conditions (obtained by masking image feature elements) improves style transfer.
Theoretical and experimental validation supports the approach.
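The masking idea in the findings above can be illustrated with a small sketch. This is not the authors' implementation: the function name `mask_content_elements`, the `num_masked` parameter, and the selection rule (zeroing the image-feature elements whose element-wise product with a content text embedding is largest) are assumptions made for illustration only. The summary states only that specific elements of the style reference's image features are masked and that no model parameters are tuned.

```python
import numpy as np


def mask_content_elements(image_feat, content_text_feat, num_masked):
    """Zero out the image-feature elements most aligned with a content embedding.

    Hypothetical selection rule (an assumption, not taken from the paper summary):
    score each element by image_feat[i] * content_text_feat[i] and mask the
    `num_masked` highest-scoring elements. No model weights are changed.
    """
    image_feat = np.asarray(image_feat, dtype=float)
    content_text_feat = np.asarray(content_text_feat, dtype=float)
    if image_feat.ndim != 1 or image_feat.shape != content_text_feat.shape:
        raise ValueError("expected two 1-D feature vectors of equal length")
    if not 0 <= num_masked <= image_feat.size:
        raise ValueError("num_masked must be between 0 and the feature length")

    # Element-wise alignment between the image feature and the content embedding.
    scores = image_feat * content_text_feat

    # Indices of the num_masked largest scores (descending order).
    masked_idx = np.argsort(scores)[::-1][:num_masked]

    # Binary mask: 0 for dropped elements, 1 for kept elements.
    mask = np.ones_like(image_feat)
    mask[masked_idx] = 0.0
    return image_feat * mask, mask


# Toy 4-dimensional features; real image embeddings are much larger.
style_feat = [0.5, -0.2, 0.9, 0.1]
content_feat = [1.0, 1.0, 1.0, -1.0]
masked_feat, mask = mask_content_elements(style_feat, content_feat, num_masked=2)
```

In this sketch the masked feature would be passed to the diffusion model in place of the original image condition, so fewer elements of the reference image guide generation.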
Abstract
Given a style-reference image as the additional image condition, text-to-image diffusion models have demonstrated impressive capabilities in generating images that possess the content of text prompts while adopting the visual style of the reference image. However, current state-of-the-art methods often struggle to disentangle content and style from style-reference images, leading to issues such as content leakages. To address this issue, we propose a masking-based method that efficiently decouples content from style without tuning any model parameters. By simply masking specific elements in the style reference's image features, we uncover a critical yet under-explored principle: guiding with appropriately-selected fewer conditions (e.g., dropping several image feature elements) can efficiently avoid unwanted content flowing into the diffusion models, enhancing the style…
Taxonomy
Topics: Computational and Text Analysis Methods · Digital Humanities and Scholarship · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion
