Noise Estimation for Generative Diffusion Models
Robin San-Roman, Eliya Nachmani, Lior Wolf

TL;DR
This paper introduces a simple, versatile learning scheme for tuning noise parameters in diffusion models, enabling improved synthesis with fewer steps without retraining the model.
Contribution
It proposes a step-by-step noise parameter adjustment method that enhances diffusion model performance for various step counts without weight modifications.
Findings
Significantly improves synthesis quality with fewer denoising steps
Requires negligible additional computation
Works without retuning for different step numbers
Abstract
Generative diffusion models have emerged as leading models in speech and image generation. However, to perform well with a small number of denoising steps, they require costly tuning of the set of noise parameters. In this work, we present a simple and versatile learning scheme that adjusts those noise parameters step by step, for any given number of steps, whereas previous work must retune them for each number of steps separately. Furthermore, without modifying the weights of the diffusion model, we are able to significantly improve the synthesis results for a small number of steps. Our approach comes at a negligible computation cost.
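The idea of adjusting noise parameters step by step while the diffusion model stays frozen can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' algorithm: the names `refit_betas`, `adjust_step_by_step`, `estimate_alpha_bar`, and `denoise_step` are hypothetical, and the constant-beta refit rule is chosen only because it is easy to verify. The abstract does not specify how the noise parameters are computed.

```python
import numpy as np


def refit_betas(alpha_bar_hat, steps_left):
    """Return `steps_left` equal betas whose product of (1 - beta) is alpha_bar_hat.

    ASSUMPTION: spreading the remaining noise evenly over the remaining
    steps is an illustrative choice, not the rule used in the paper.
    """
    if not 0.0 < alpha_bar_hat <= 1.0:
        raise ValueError("alpha_bar_hat must lie in (0, 1]")
    if steps_left < 1:
        raise ValueError("steps_left must be at least 1")
    beta = 1.0 - alpha_bar_hat ** (1.0 / steps_left)
    return np.full(steps_left, beta)


def adjust_step_by_step(estimate_alpha_bar, denoise_step, x, num_steps):
    """Run `num_steps` denoising steps, re-estimating the noise level each time.

    `estimate_alpha_bar` stands in for a learned noise-level estimator and
    `denoise_step` for one step of a frozen diffusion model; both are
    hypothetical callables supplied by the caller.
    """
    used_betas = []
    for steps_left in range(num_steps, 0, -1):
        alpha_bar_hat = estimate_alpha_bar(x)           # noise level of current sample
        beta_t = refit_betas(alpha_bar_hat, steps_left)[0]
        x = denoise_step(x, beta_t, alpha_bar_hat)      # model weights are not modified
        used_betas.append(beta_t)
    return x, np.array(used_betas)
```

Because the schedule is rebuilt from the current estimate and the number of steps left, the same loop runs for any `num_steps` without a separate tuning pass, which is the property the abstract emphasizes.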
Taxonomy
Topics: Music and Audio Processing · Speech Recognition and Synthesis · Speech and Audio Processing
Methods: Diffusion
