TL;DR
DogFit introduces a domain-guided fine-tuning method for diffusion models that improves transfer learning efficiency and controllability without extra computational costs, outperforming prior guidance techniques.
Contribution
The paper proposes DogFit, a novel guidance mechanism that internalizes domain-aware guidance during fine-tuning, enabling efficient and controllable transfer learning of diffusion models.
Findings
DogFit outperforms prior guidance methods on the FID and FD_DINOv2 metrics.
It requires up to 2x fewer sampling TFLOPs.
It improves generation quality and training stability across diverse domains.
Abstract
Transfer learning of diffusion models to smaller target domains is challenging, as naively fine-tuning the model often results in poor generalization. Test-time guidance methods help mitigate this by offering controllable improvements in image fidelity through a trade-off with sample diversity. However, this benefit comes at a high computational cost, typically requiring dual forward passes during sampling. We propose the Domain-guided Fine-tuning (DogFit) method, an effective guidance mechanism for diffusion transfer learning that maintains controllability without incurring additional computational overhead. DogFit injects a domain-aware guidance offset into the training loss, effectively internalizing the guided behavior during the fine-tuning process. The domain-aware design is motivated by our observation that during fine-tuning, the unconditional source model offers a stronger…
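The contrast the abstract draws, between guidance applied at sampling time and guidance folded into the training loss, can be sketched in a few lines. The first function is the standard classifier-free guidance combination, which needs two noise predictions (two forward passes) per sampling step. The second is only an illustrative assumption of what an "offset in the training loss" might look like: the function name `offset_target`, its arguments, and the `(w - 1)` scaling are hypothetical and are not taken from the paper, whose exact loss is not given in the truncated abstract above.

```python
from typing import List

Vector = List[float]


def cfg_combine(eps_cond: Vector, eps_uncond: Vector, w: float) -> Vector:
    """Standard classifier-free guidance at sampling time.

    Both eps_cond and eps_uncond must be computed at every step,
    which is the dual-forward-pass cost the abstract refers to.
    """
    return [u + w * (c - u) for c, u in zip(eps_cond, eps_uncond)]


def offset_target(eps_true: Vector,
                  eps_cond_detached: Vector,
                  eps_src_uncond: Vector,
                  w: float) -> Vector:
    """HYPOTHETICAL training target with a guidance offset (an assumption,
    not the paper's formula).

    eps_cond_detached: the fine-tuned model's conditional prediction,
        treated as a constant (no gradient flows through it).
    eps_src_uncond: the frozen source model's unconditional prediction.
    With w = 1 the offset vanishes and the target is the true noise.
    """
    return [e + (w - 1.0) * (c - u)
            for e, c, u in zip(eps_true, eps_cond_detached, eps_src_uncond)]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between a prediction and its target."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
```

Under this assumed form, a model trained with `mse(prediction, offset_target(...))` would need only one forward pass per sampling step, because the guided behavior is learned rather than computed at test time. Setting `w = 1.0` in either function recovers the unguided case.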