TL;DR
This paper introduces a Lipschitz-based regularization method for personalized text-to-image diffusion models that prevents distributional drift, improves fidelity, and maintains diversity during concept adaptation.
Contribution
The paper proposes a regularization technique that constrains parameter updates during personalization, keeping the adapted model close to the pretrained model's output distribution while it learns the new concept.
Findings
Achieves superior visual fidelity and prompt adherence in experiments.
Reduces overfitting and preserves diversity compared to prior methods.
Offers a computationally efficient alternative to resource-intensive sampling.
Abstract
Personalizing text-to-image diffusion models involves integrating novel visual concepts from a small set of reference images while retaining the model's original generative capabilities. However, this process often leads to overfitting, where the model ignores the user's prompt and merely replicates the reference images. We attribute this issue to a fundamental misalignment between the true goals of personalization, which are subject fidelity and text alignment, and the training objectives of existing methods that fail to enforce both objectives simultaneously. Specifically, prior approaches often overlook the need to explicitly preserve the pretrained model's output distribution, resulting in distributional drift that undermines diversity and coherence. To resolve these challenges, we introduce a Lipschitz-based regularization objective that constrains parameter updates during…
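The abstract is cut off before it states the objective, so the exact form of the regularizer is not available here. As a rough illustration of what a Lipschitz-style constraint on updates can look like, the sketch below penalizes the fine-tuned model only when its output moves further from the pretrained output than a constant `k` times the distance moved in parameter space. The function names, the hinge-squared form, `k`, and `weight` are all assumptions made for illustration; this is not the paper's objective.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lipschitz_penalty(out_new, out_ref, theta_new, theta_ref, k=1.0):
    """Hypothetical hinge-squared penalty (an assumption, not the paper's loss).

    Returns zero while
        ||out_new - out_ref|| <= k * ||theta_new - theta_ref||
    and grows quadratically with the excess once that bound is violated.
    """
    out_shift = l2_norm([a - b for a, b in zip(out_new, out_ref)])
    param_shift = l2_norm([a - b for a, b in zip(theta_new, theta_ref)])
    excess = max(0.0, out_shift - k * param_shift)
    return excess ** 2


def total_loss(personalization_loss, penalty, weight=0.1):
    """Combine the usual fine-tuning loss with the weighted penalty."""
    return personalization_loss + weight * penalty
```

In a real diffusion setup, `out_new` and `out_ref` would be the noise predictions of the fine-tuned and frozen pretrained networks on the same noisy input and prompt, and the norms would be computed on tensors rather than Python lists. A penalty of this kind needs only one extra forward pass through the frozen model per step, which is one way a regularizer can avoid generating additional samples.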