Optimizing Input of Denoising Score Matching is Biased Towards Higher Score Norm
Tongda Xu

TL;DR
This paper reveals that optimizing the input of denoising score matching introduces a bias towards higher score norms, affecting various diffusion-based applications across multiple domains.
Contribution
The paper demonstrates that optimizing the conditional input of a diffusion model with denoising score matching breaks the equivalence with exact score matching and biases the result towards a higher score norm, an issue shared by many recent works that use this objective.
Findings
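Optimizing the conditional input of a diffusion model with denoising score matching breaks its equivalence with exact score matching.
The resulting bias favors a higher score norm.
A similar bias appears when optimizing the data distribution with a pre-trained diffusion model; affected works include MAR, PerCo, and DreamFusion.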
Abstract
Many recent works utilize denoising score matching to optimize the conditional input of diffusion models. In this workshop paper, we demonstrate that such optimization breaks the equivalence between denoising score matching and exact score matching. Furthermore, we show that this bias leads to a higher score norm. Additionally, we observe a similar bias when optimizing the data distribution using a pre-trained diffusion model. Finally, we discuss the wide range of works across different domains that are affected by this bias, including MAR for auto-regressive generation, PerCo for image compression, and DreamFusion for text-to-3D generation.
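The claim in the abstract can be sketched with the standard identity relating denoising score matching (DSM) to exact score matching (ESM). The notation below is an assumption made for illustration and is not taken from the paper: x_0 is the data, x_t the noised sample, c the conditional input, s_\theta the learned score, and p(x_t | x_0) a Gaussian noising kernel.

```latex
% Illustrative notation (assumed, not the paper's): x_0 data, x_t noised sample,
% c conditional input, s_\theta(x_t, c) learned score, p(x_t | x_0) Gaussian kernel.
\begin{align}
\mathcal{L}_{\mathrm{DSM}}(\theta, c)
  &= \mathbb{E}_{p(x_0 \mid c)\, p(x_t \mid x_0)}
     \left\| s_\theta(x_t, c) - \nabla_{x_t} \log p(x_t \mid x_0) \right\|^2, \\
\mathcal{L}_{\mathrm{ESM}}(\theta, c)
  &= \mathbb{E}_{p(x_t \mid c)}
     \left\| s_\theta(x_t, c) - \nabla_{x_t} \log p(x_t \mid c) \right\|^2, \\
\mathcal{L}_{\mathrm{DSM}}(\theta, c)
  &= \mathcal{L}_{\mathrm{ESM}}(\theta, c)
   + \underbrace{\mathbb{E}\left\| \nabla_{x_t} \log p(x_t \mid x_0) \right\|^2}_{\text{fixed by the Gaussian kernel}}
   - \underbrace{\mathbb{E}_{p(x_t \mid c)}\left\| \nabla_{x_t} \log p(x_t \mid c) \right\|^2}_{\text{depends on } c}.
\end{align}
```

With respect to \theta, the last two terms are constants, so DSM and ESM share the same minimizer. With respect to c, the final term is no longer constant: the DSM loss can be lowered simply by raising the expected squared norm of the conditional score, which is one way to read the bias towards a higher score norm described above.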
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Domain Adaptation and Few-Shot Learning
