Neural Diffusion Models

Grigory Bartosh; Dmitry Vetrov; Christian A. Naesseth

arXiv:2310.08337·cs.LG·June 4, 2024·1 cites

Open Access 3 Reviews

TL;DR

Neural Diffusion Models extend traditional diffusion models by enabling non-linear, time-dependent transformations, leading to improved likelihoods and sample quality in image generation tasks.

Contribution

The paper introduces Neural Diffusion Models, allowing non-linear transformations and providing a variational training method with a continuous-time formulation.

Findings

01. NDMs outperform conventional diffusion models on likelihood metrics.

02. NDMs generate higher-quality images on benchmark datasets.

03. The continuous formulation enables efficient inference with standard solvers.

Abstract

Diffusion models have shown remarkable performance on many generative tasks. Despite recent success, most diffusion models are restricted in that they only allow linear transformations of the data distribution. In contrast, a broader family of transformations can potentially help train generative distributions more efficiently, simplifying the reverse process and closing the gap between the true negative log-likelihood and the variational approximation. In this paper, we present Neural Diffusion Models (NDMs), a generalization of conventional diffusion models that enables defining and learning time-dependent non-linear transformations of data. We show how to optimise NDMs using a variational bound in a simulation-free setting. Moreover, we derive a time-continuous formulation of NDMs, which allows fast and reliable inference using off-the-shelf numerical ODE and SDE solvers. Finally, we…
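As a rough illustration of what a "time-dependent non-linear transformation of data" in a simulation-free forward process could look like, here is a minimal NumPy sketch. It assumes the transformation enters through the mean of a Gaussian marginal, z_t = alpha_t * F(x, t) + sigma_t * eps. The cosine schedule, the toy `transform` function, and the weight matrix `W` are all hypothetical stand-ins chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def schedule(t):
    """Variance-preserving cosine noise schedule (an assumption for this sketch)."""
    alpha = np.cos(0.5 * np.pi * t)
    sigma = np.sin(0.5 * np.pi * t)
    return alpha, sigma

def transform(x, t, W):
    """Hypothetical stand-in for a learnable, time-dependent transformation.

    It is the identity at t = 0, and with W = 0 it is the identity for all t,
    which recovers the conventional (linear) forward process.
    """
    return x + t * np.tanh(x @ W)

def forward_sample(x, t, W, rng):
    """Draw z_t directly from the assumed marginal, without simulating a trajectory."""
    alpha, sigma = schedule(t)
    eps = rng.standard_normal(x.shape)
    return alpha * transform(x, t, W) + sigma * eps

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))        # toy batch: 4 points in 3 dimensions
W = 0.1 * rng.standard_normal((3, 3))  # toy "learned" parameters
z_half = forward_sample(x, 0.5, W, rng)
z_zero = forward_sample(x, 0.0, W, rng)
```

In this sketch, sampling at any time `t` is a single function call, which is what "simulation-free" refers to here. In an actual model, the toy `transform` would be replaced by a neural network trained jointly with the reverse process.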

Peer Reviews

Decision · ICML 2024 Poster

Reviewer 01 · Rating 5 · marginally below the acceptance threshold · Confidence 4

Strengths

- The idea of generalizing diffusion models to learnable non-linear transformations is interesting.
- Many previously proposed diffusion models and flow models are special cases of NDMs with a specific choice of transformation.
- The qualitative results of learned transformations for different datasets in Figure 2 are interesting.
- NDM provides consistent gains in terms of NLL and NELBO over DDPM (see Tables 4 and 7).

Weaknesses

- One of the primary motivations for learnable transformations is that they simplify the data distribution and therefore lead to predictions of x that are more aligned with the data. Ideally, if the transformations indeed helped simplify the data distribution, one should observe better quantitative metrics in fewer sampling steps. However, the actual gains in quantitative metrics like NLL and NELBO seem marginal. Further, there seem to be no consistent gains in terms of FID. In addition,

Reviewer 02 · Rating 6 · marginally above the acceptance threshold · Confidence 3

Strengths

- The background and techniques of NDMs are clearly described.
- The visualization of the transformed data is insightful and pretty helpful in understanding the benefits of NDMs.

Weaknesses

- It seems that the technical details are similar to those of conventional DMs. So, what are the technical challenges and novelties here?
- In my opinion, the argument "a key limitation of most existing diffusion models is that they rely on a fixed and pre-specified forward process that is unable to adapt to the specific task or data at hand" is not convincing enough. The extensive empirical studies in the community reflect that conventional DMs have enough flexibility to accommodate diverse data. S

Reviewer 03 · Rating 5 · marginally below the acceptance threshold · Confidence 4

Strengths

1. The idea of using a parameterized marginal distribution makes a lot of sense, since we have no idea which configuration of the marginal distributions is best and most of them are handcrafted.
2. The empirical performance successfully demonstrates the effectiveness of the proposed method in likelihood estimation.

Weaknesses

1. The presentation of the paper should be polished. The learning and sampling algorithms are difficult to find. I suggest the authors shorten the discussion section or defer it to the Appendix, and add algorithm boxes to the main text. Moreover, I have several questions about the training process: (1) Are $\phi$ and $\theta$ trained jointly or alternately? (2) How are the hyper-parameters of $F_\phi$ set? Are they similar to those of the x-prediction network?
2. Certain constraints of $F_\phi(x_t, t)$ should

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Advanced Neuroimaging Techniques and Applications

Methods: Diffusion