Noise Consistency Regularization for Improved Subject-Driven Image Synthesis
Yao Ni, Song Wen, Piotr Koniusz, Anoop Cherian

TL;DR
This paper introduces auxiliary consistency losses for fine-tuning Stable Diffusion, effectively balancing subject identity preservation and background diversity in subject-driven image synthesis.
Contribution
It proposes two novel regularization losses that improve fidelity and diversity during diffusion model fine-tuning, outperforming existing methods like DreamBooth.
Findings
Enhanced subject identity preservation
Increased background diversity
Outperforms DreamBooth in CLIP scores and visual quality
Abstract
Fine-tuning Stable Diffusion enables subject-driven image synthesis by adapting the model to generate images containing specific subjects. However, existing fine-tuning methods suffer from two key issues: underfitting, where the model fails to reliably capture subject identity, and overfitting, where it memorizes the subject image and reduces background diversity. To address these challenges, we propose two auxiliary consistency losses for diffusion fine-tuning. First, a prior consistency regularization loss ensures that the predicted diffusion noise for prior (non-subject) images remains consistent with that of the pretrained model, improving fidelity. Second, a subject consistency regularization loss enhances the fine-tuned model's robustness to latent codes modulated by multiplicative noise, helping to preserve subject identity while improving diversity. Our experimental results…
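The two losses described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the exact formulas, so the mean-squared-error form, the noise scale `sigma`, the point at which the multiplicative noise is applied, and all function names here are assumptions made for illustration. The "models" are toy callables standing in for the noise-prediction network, and timestep and text conditioning are omitted.

```python
import numpy as np

def add_forward_noise(z0, eps, alpha_bar):
    """Standard diffusion forward step: mix a clean latent with Gaussian noise."""
    return np.sqrt(alpha_bar) * z0 + np.sqrt(1.0 - alpha_bar) * eps

def prior_consistency_loss(finetuned, pretrained, z_prior, eps, alpha_bar):
    """Assumed form: MSE between the fine-tuned and frozen pretrained noise
    predictions on a noised prior (non-subject) latent."""
    z_t = add_forward_noise(z_prior, eps, alpha_bar)
    return float(np.mean((finetuned(z_t) - pretrained(z_t)) ** 2))

def subject_consistency_loss(finetuned, z_subject, eps, alpha_bar, rng, sigma=0.1):
    """Assumed form: MSE between the fine-tuned predictions for a subject latent
    and for the same latent modulated by multiplicative Gaussian noise."""
    # Hypothetical modulation: scale each latent entry by (1 + sigma * n), n ~ N(0, 1).
    mult = 1.0 + sigma * rng.standard_normal(z_subject.shape)
    z_t = add_forward_noise(z_subject, eps, alpha_bar)
    z_t_mod = add_forward_noise(z_subject * mult, eps, alpha_bar)
    return float(np.mean((finetuned(z_t) - finetuned(z_t_mod)) ** 2))

# Toy stand-ins for the noise predictors (hypothetical, for illustration only).
rng = np.random.default_rng(0)
z = rng.standard_normal((4, 8))      # a small batch of fake latent codes
eps = rng.standard_normal((4, 8))    # sampled diffusion noise
pretrained = lambda z_t: 0.5 * z_t   # frozen reference model
finetuned = lambda z_t: 0.6 * z_t    # model being fine-tuned

loss_prior = prior_consistency_loss(finetuned, pretrained, z, eps, alpha_bar=0.7)
loss_subject = subject_consistency_loss(finetuned, z, eps, alpha_bar=0.7, rng=rng)
```

In this sketch the prior loss vanishes when the fine-tuned model matches the pretrained one on prior images, and the subject loss vanishes when the prediction is unchanged by the modulation. In practice both terms would presumably be added, with weights, to the usual fine-tuning objective; the abstract does not state the weighting.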
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Computer Graphics and Visualization Techniques
Methods: Diffusion · Contrastive Language-Image Pre-training
