Training-Free Style and Content Transfer by Leveraging U-Net Skip Connections in Stable Diffusion
Ludovica Schaerf, Andrea Alfarano, Fabrizio Silvestri, Leonardo Impett

TL;DR
This paper investigates the role of U-Net skip connections in diffusion models, revealing their importance in content and style separation, and introduces SkipInject for effective style and content transfer without additional training.
Contribution
The study provides a detailed analysis of U-Net skip connections, highlighting their role in image content and style separation, and proposes SkipInject for training-free style and content transfer.
Findings
Skip connections carry most spatial information for image reconstruction.
Injecting representations from specific skip connections enables effective style transfer.
SkipInject outperforms state-of-the-art methods in content alignment and structural preservation.
Abstract
Recent advances in diffusion models for image generation have led to detailed examinations of several components within the U-Net architecture for image editing. While previous studies have focused on the bottleneck layer (h-space), cross-attention, self-attention, and decoding layers, the overall role of the skip connections of the U-Net itself has not been specifically addressed. We conduct thorough analyses of the role of the skip connections and find that the residual connections passed by the third encoder block carry most of the spatial information of the reconstructed image, separating the content from the style, which is passed by the remaining stream in the opposite decoding layer. We show that injecting the representations from this block can be used for text-based editing, precise modifications, and style transfer. We compare our method, SkipInject, to state-of-the-art style transfer…
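Illustrative sketch
The injection idea described in the abstract can be sketched on a toy one-dimensional "U-Net". This is not the authors' implementation and it does not use Stable Diffusion: the functions downsample, upsample, and unet_forward, the 50/50 mixing rule (a stand-in for concatenation followed by convolution), and the example signals are all assumptions made for illustration. The sketch only shows the mechanism: the skip saved by the third encoder block during a "content" pass replaces the corresponding skip in a "style" pass.

```python
from typing import Dict, List, Optional, Tuple

Signal = List[float]


def downsample(x: Signal) -> Signal:
    """Halve the resolution by averaging adjacent pairs (toy encoder step)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]


def upsample(x: Signal) -> Signal:
    """Double the resolution by repeating each value (toy decoder step)."""
    return [v for v in x for _ in range(2)]


def unet_forward(
    x: Signal,
    depth: int = 3,
    injected_skips: Optional[Dict[int, Signal]] = None,
) -> Tuple[Signal, List[Signal]]:
    """Run the toy U-Net and return (output, skips saved by the encoder).

    injected_skips maps a 0-based encoder block index to a skip tensor that
    overrides the one computed from x. The input length must be divisible
    by 2 ** depth.
    """
    skips: List[Signal] = []
    h = list(x)
    for _ in range(depth):
        skips.append(h)       # features handed to the mirrored decoder block
        h = downsample(h)     # main stream continues towards the bottleneck

    used = list(skips)
    for index, skip in (injected_skips or {}).items():
        used[index] = skip    # swap in a skip taken from another forward pass

    for skip in reversed(used):
        h = upsample(h)
        # Assumed stand-in for "concatenate skip, then convolve": equal mix.
        h = [0.5 * a + 0.5 * b for a, b in zip(h, skip)]
    return h, skips


# Content signal: coarse layout (bright left half, dark right half).
content = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
# Style signal: fine alternating texture with no coarse layout.
style = [0.2, 0.8, 0.2, 0.8, 0.2, 0.8, 0.2, 0.8]

_, content_skips = unet_forward(content)
plain, _ = unet_forward(style)

# Inject only the skip of the third encoder block (index 2) from the content pass.
edited, _ = unet_forward(style, injected_skips={2: content_skips[2]})
```

In this toy setting, edited keeps the fine alternating texture of style while its left half becomes brighter than its right half, mirroring the coarse layout of content. Whether the same separation holds in a real diffusion U-Net is the empirical claim of the paper, not something this sketch demonstrates.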
Taxonomy
Topics: Speech Recognition and Synthesis · Advanced Data Compression Techniques · Natural Language Processing Techniques
Methods: Max Pooling · Convolution · Concatenated Skip Connection · U-Net · Diffusion · Focus
