Noise Estimation for Generative Diffusion Models

Robin San-Roman; Eliya Nachmani; Lior Wolf

arXiv:2104.02600 · cs.LG · September 14, 2021 · 38 cites

Open Access

TL;DR

This paper introduces a simple, versatile learning scheme for tuning noise parameters in diffusion models, enabling improved synthesis with fewer steps without retraining the model.

Contribution

The paper proposes a method that adjusts the noise parameters step by step, improving diffusion model performance across different step counts without modifying the model's weights.

Findings

01. Significantly improves synthesis quality when only a small number of denoising steps is used

02. Requires negligible additional computation

03. Adapts to any given number of steps without retuning for each one

Abstract

Generative diffusion models have emerged as leading models in speech and image generation. However, in order to perform well with a small number of denoising steps, a costly tuning of the set of noise parameters is needed. In this work, we present a simple and versatile learning scheme that can step-by-step adjust those noise parameters, for any given number of steps, while the previous work needs to retune for each number separately. Furthermore, without modifying the weights of the diffusion model, we are able to significantly improve the synthesis results, for a small number of steps. Our approach comes at a negligible computation cost.
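
To make the term "noise parameters" concrete, the following is a minimal sketch that assumes the standard DDPM-style parameterization, in which each step t has a noise parameter beta_t and a cumulative noise level alpha_bar_t equal to the product of (1 - beta_s) over all steps s up to t. The abstract does not spell out this parameterization, so it is an assumption here, as are the function names and the linear schedule with its default endpoints. The sketch does not reproduce the authors' learning scheme. It only shows how per-step parameters and cumulative noise levels determine each other, which is why having the noise level at each step is enough to recover the per-step parameters.

```python
def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear schedule of per-step noise parameters beta_t.

    The endpoints are illustrative defaults, not values from the paper.
    """
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Cumulative noise levels: alpha_bar_t = prod_{s <= t} (1 - beta_s)."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def betas_from_alpha_bars(alpha_bars):
    """Invert the product: beta_t = 1 - alpha_bar_t / alpha_bar_{t-1}.

    alpha_bar_0 is taken to be 1 (no noise before the first step).
    """
    betas = []
    previous = 1.0
    for alpha_bar in alpha_bars:
        betas.append(1.0 - alpha_bar / previous)
        previous = alpha_bar
    return betas


if __name__ == "__main__":
    # The same schedule family evaluated for two different step counts.
    for num_steps in (6, 50):
        alpha_bars = cumulative_alphas(linear_betas(num_steps))
        print(num_steps, "steps, final alpha_bar =", alpha_bars[-1])
```

With a fixed schedule like this one, a change in the number of steps changes every beta_t, which illustrates why a schedule tuned for one step count does not carry over to another.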

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Speech and Audio Processing

Methods: Diffusion