Controlling Text-to-Image Diffusion by Orthogonal Finetuning
Zeju Qiu, Weiyang Liu, Haiwen Feng, Yuxuan Xue, Yao Feng, Zhen Liu, Dan Zhang, Adrian Weller, Bernhard Schölkopf

TL;DR
This paper introduces Orthogonal Finetuning (OFT), a novel method for adapting text-to-image diffusion models to downstream tasks while preserving their semantic generation ability, with improved stability and performance.
Contribution
The paper proposes OFT and its constrained variant COFT, finetuning techniques that preserve the pretrained model's hyperspherical energy and improve finetuning stability, and reports that they outperform existing methods on text-to-image tasks.
Findings
OFT provably preserves hyperspherical energy, which the authors find crucial for retaining semantic generation ability.
OFT outperforms existing methods in generation quality and convergence speed.
COFT adds a radius constraint on the hypersphere to improve finetuning stability.
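The first finding can be checked numerically on a toy example. The sketch below assumes one common form of hyperspherical energy, the sum over neuron pairs of the inverse distance between their unit-normalized weight vectors; the exact definition used in the paper is not given in this summary. The helper names (`hyperspherical_energy`, `givens_rotate`) and the random "neurons" are illustrative only. It applies the same orthogonal map to every neuron and compares the result with a per-neuron additive perturbation.

```python
import math
import random


def normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def hyperspherical_energy(neurons):
    """Assumed definition: sum over i != j of 1 / ||w_i_hat - w_j_hat||."""
    unit = [normalize(w) for w in neurons]
    total = 0.0
    for i, a in enumerate(unit):
        for j, b in enumerate(unit):
            if i != j:
                total += 1.0 / math.dist(a, b)
    return total


def givens_rotate(v, i, j, theta):
    """Rotate v by theta in the (i, j) coordinate plane; this map is orthogonal."""
    out = list(v)
    c, s = math.cos(theta), math.sin(theta)
    out[i] = c * v[i] - s * v[j]
    out[j] = s * v[i] + c * v[j]
    return out


random.seed(0)
dim, num = 4, 6
neurons = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num)]

# The same orthogonal transform (a product of two plane rotations) for every neuron.
rotated = [givens_rotate(givens_rotate(w, 0, 1, 0.7), 2, 3, -1.2) for w in neurons]

# For contrast: independent additive noise per neuron, which is not orthogonal.
perturbed = [[x + random.gauss(0, 0.5) for x in w] for w in neurons]

e_original = hyperspherical_energy(neurons)
e_rotated = hyperspherical_energy(rotated)
e_perturbed = hyperspherical_energy(perturbed)

print("original :", e_original)
print("rotated  :", e_rotated)
print("perturbed:", e_perturbed)
```

An orthogonal map preserves all pairwise angles between neurons, so the rotated energy matches the original up to floating-point error, while the additive perturbation generally changes it.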
Abstract
Large text-to-image diffusion models have impressive capabilities in generating photorealistic images from text prompts. How to effectively guide or control these powerful models to perform different downstream tasks becomes an important open problem. To tackle this challenge, we introduce a principled finetuning method -- Orthogonal Finetuning (OFT), for adapting text-to-image diffusion models to downstream tasks. Unlike existing methods, OFT can provably preserve hyperspherical energy which characterizes the pairwise neuron relationship on the unit hypersphere. We find that this property is crucial for preserving the semantic generation ability of text-to-image diffusion models. To improve finetuning stability, we further propose Constrained Orthogonal Finetuning (COFT) which imposes an additional radius constraint to the hypersphere. Specifically, we consider two important finetuning…
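A minimal sketch of how a learnable orthogonal matrix with a bounded deviation from the identity might be built. It rests on two assumptions that this summary does not confirm: that orthogonality is obtained through the Cayley transform R = (I + Q)(I - Q)^{-1} of a skew-symmetric Q, and that the radius constraint can be approximated by rescaling Q to a Frobenius norm of at most eps. All function names and values here are hypothetical. Under these assumptions R - I = 2Q(I - Q)^{-1}, so the distance of R from the identity is at most 2 * eps.

```python
import math


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b, sign=1.0):
    """Elementwise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fro_norm(a):
    return math.sqrt(sum(v * v for row in a for v in row))


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (fine for small matrices)."""
    n = len(a)
    aug = [list(row) + eye_row for row, eye_row in zip(a, identity(n))]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def skew_from_params(params, n):
    """Fill the upper triangle with free parameters and mirror with a sign flip."""
    q = [[0.0] * n for _ in range(n)]
    it = iter(params)
    for i in range(n):
        for j in range(i + 1, n):
            q[i][j] = next(it)
            q[j][i] = -q[i][j]
    return q


def cayley(q):
    """Assumed parameterization: R = (I + Q)(I - Q)^{-1}, orthogonal for skew Q."""
    eye = identity(len(q))
    return matmul(add(eye, q), inverse(add(eye, q, sign=-1.0)))


def clip_to_radius(q, eps):
    """Assumed stand-in for the radius constraint: rescale so ||Q||_F <= eps."""
    norm = fro_norm(q)
    scale = min(1.0, eps / norm) if norm > 0 else 1.0
    return [[v * scale for v in row] for row in q]


n, eps = 3, 0.1
q = skew_from_params([0.9, -0.4, 1.3], n)  # hypothetical learned parameters
r_free = cayley(q)
r_clipped = cayley(clip_to_radius(q, eps))

eye = identity(n)
orth_error = fro_norm(add(matmul(transpose(r_clipped), r_clipped), eye, sign=-1.0))
dist_free = fro_norm(add(r_free, eye, sign=-1.0))
dist_clipped = fro_norm(add(r_clipped, eye, sign=-1.0))

print("orthogonality error:", orth_error)
print("||R - I|| without clipping:", dist_free)
print("||R - I|| with clipping   :", dist_clipped)
```

Clipping Q rather than R keeps R exactly orthogonal, so the energy-preserving property is unaffected while the finetuned weights stay close to the pretrained ones.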
Taxonomy
Topics: Model Reduction and Neural Networks · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion
