Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model
Joo Young Choi, Jaesung R. Park, Inkyu Park, Jaewoong Cho, Albert No, Ernest K. Ryu

TL;DR
This paper demonstrates that adding LoRA conditioning directly to the attention layers of diffusion models enhances image generation quality without altering the existing architecture.
Contribution
Introducing a simple, drop-in LoRA conditioning method for attention layers that improves diffusion model performance.
Findings
LoRA conditioning improves FID scores on CIFAR-10
No need to modify existing U-Net architecture
Significant quality gains with minimal changes
Abstract
Current state-of-the-art diffusion models employ U-Net architectures containing convolutional and (qkv) self-attention layers. The U-Net processes images while being conditioned on the time embedding input for each sampling step and the class or caption embedding input corresponding to the desired conditional generation. Such conditioning involves scale-and-shift operations to the convolutional layers but does not directly affect the attention layers. While these standard architectural choices are certainly effective, not conditioning the attention layers feels arbitrary and potentially suboptimal. In this work, we show that simply adding LoRA conditioning to the attention layers without changing or tuning the other parts of the U-Net architecture improves the image generation quality. For example, a drop-in addition of LoRA conditioning to EDM diffusion model yields FID scores of…
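To make the idea in the abstract concrete, here is a minimal NumPy sketch of a (qkv) self-attention layer whose projection weights receive a condition-dependent low-rank update. It is an illustration under stated assumptions, not the authors' implementation: the function names, the shapes, the use of several LoRA pairs mixed by condition-dependent weights `omega`, and the linear map `W_omega` from a condition embedding to those weights are all hypothetical choices made for this example.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def lora_linear(x, W, A, B, omega):
    # Effective weight: W + sum_m omega[m] * B[m] @ A[m]
    # W: (d_out, d_in), A: (m, r, d_in), B: (m, d_out, r), omega: (m,)
    delta = np.einsum("m,mor,mri->oi", omega, B, A)
    return x @ (W + delta).T


def init_params(rng, d, rank, num_basis):
    # Base weights stand in for an existing attention layer.
    # B starts at zero (the usual LoRA initialization), so the added
    # conditioning leaves the base layer's output unchanged at first.
    params = {}
    for n in "qkv":
        params["W" + n] = rng.normal(size=(d, d)) / np.sqrt(d)
        params["A" + n] = rng.normal(size=(num_basis, rank, d)) / np.sqrt(d)
        params["B" + n] = np.zeros((num_basis, d, rank))
    return params


def self_attention(x, params, omega):
    # Single-head self-attention over `x` of shape (tokens, d); the q, k, v
    # projections all receive the same condition-dependent weights `omega`.
    q, k, v = (
        lora_linear(x, params["W" + n], params["A" + n], params["B" + n], omega)
        for n in "qkv"
    )
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return attn @ v


rng = np.random.default_rng(0)
d, rank, num_basis, tokens, emb_dim = 8, 2, 3, 5, 4
params = init_params(rng, d, rank, num_basis)
x = rng.normal(size=(tokens, d))

# Hypothetical linear map from a condition embedding (time or class)
# to the mixing weights of the LoRA pairs.
W_omega = rng.normal(size=(emb_dim, num_basis))
omega_a = rng.normal(size=emb_dim) @ W_omega
omega_b = rng.normal(size=emb_dim) @ W_omega

# With B = 0 the layer ignores the condition entirely.
out_init_a = self_attention(x, params, omega_a)
out_init_b = self_attention(x, params, omega_b)

# Random B stands in for trained LoRA weights; the output now
# depends on which condition is supplied.
for n in "qkv":
    params["B" + n] = 0.1 * rng.normal(size=(num_basis, d, rank))
out_a = self_attention(x, params, omega_a)
out_b = self_attention(x, params, omega_b)
```

Because the low-rank update is added to the existing projection weights rather than replacing them, a sketch like this can wrap an attention layer without touching the convolutional layers or their scale-and-shift conditioning, which is the sense in which the abstract calls the method "drop-in".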
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques
Methods: Convolution · Concatenated Skip Connection · Max Pooling · U-Net · Diffusion
