Denoising Diffusion Gamma Models

Eliya Nachmani; Robin San Roman; Lior Wolf

arXiv:2110.05948 · eess.SP · October 13, 2021 · 6 cites

Open Access

TL;DR

This paper introduces the Denoising Diffusion Gamma Model (DDGM), replacing Gaussian noise with Gamma noise in diffusion processes, leading to improved image and speech generation results.

Contribution

It proposes a novel diffusion model using Gamma noise, expanding beyond Gaussian assumptions for better generative performance.

Findings

01. Gamma noise improves image generation quality

02. Gamma diffusion enhances speech synthesis results

03. Efficient sampling is maintained with Gamma noise

Abstract

Generative diffusion processes are an emerging and effective tool for image and speech generation. In existing methods, the underlying noise distribution of the diffusion process is Gaussian. However, fitting distributions with more degrees of freedom could improve the performance of such generative models. In this work, we investigate other types of noise distribution for the diffusion process. Specifically, we introduce the Denoising Diffusion Gamma Model (DDGM) and show that noise from a Gamma distribution provides improved results for image and speech generation. Our approach preserves the ability to efficiently sample any state of the diffusion process during training while using Gamma noise.
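
The abstract does not give the noise parameterization, so the following is only a minimal sketch of how a Gamma-noise forward process can keep single-shot sampling of a state at step t. It assumes a hypothetical construction in which each step adds zero-mean Gamma noise with variance beta_i and all scaled steps share one scale parameter, so their sum is again Gamma distributed. The function name `gamma_forward_sample`, the parameter `theta0`, and the linear schedule in the usage note are illustrative assumptions, not taken from the page.

```python
import numpy as np

def gamma_forward_sample(x0, t, betas, theta0=0.001, rng=None):
    """Draw x_t directly from x_0 under a zero-mean Gamma noise process.

    Illustrative assumption: step i adds (g_i - E[g_i]) with
    g_i ~ Gamma(shape=k_i, scale=sqrt(alpha_bar_i) * theta0).
    """
    rng = np.random.default_rng() if rng is None else rng
    alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention
    # Shape per step, chosen so that step i contributes variance beta_i.
    k = betas / (alpha_bar * theta0 ** 2)
    k_bar = np.cumsum(k)                     # shapes add when scales match
    theta_t = np.sqrt(alpha_bar[t]) * theta0
    g = rng.gamma(shape=k_bar[t], scale=theta_t, size=np.shape(x0))
    noise = g - k_bar[t] * theta_t           # subtract the Gamma mean
    return np.sqrt(alpha_bar[t]) * np.asarray(x0) + noise
```

Under these assumptions the accumulated noise has mean zero and variance 1 - alpha_bar_t, the same first two moments as the usual Gaussian forward process. For example, `gamma_forward_sample(np.zeros(4), 10, np.linspace(1e-4, 0.02, 1000))` returns one noisy state of shape `(4,)`. A larger `theta0` gives smaller shape values and therefore more skewed noise, while a very small `theta0` makes the noise nearly Gaussian.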

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion