Generative Lines Matching Models

Ori Matityahu; Raanan Fattal

arXiv:2412.06403·cs.CV·December 10, 2024


Open Access · 4 Reviews

TL;DR

This paper introduces the Lines Matching Model (LMM), a novel probability flow approach that improves sampling efficiency and fidelity in generative models by matching straight lines between source and target distributions, achieving state-of-the-art results.

Contribution

The paper proposes the LMM, a new flow-based model that matches straight lines between distributions, addressing singularities and degeneracies in denoising models, with enhanced sampling and sample quality.

Findings

1. LMM achieves state-of-the-art FID scores on benchmarks.
2. LMM exhibits highly straight and temporally consistent trajectories.
3. Theoretical analysis reveals limitations of optimal transport in high dimensions.

Abstract

In this paper we identify the source of a singularity in the training loss of key denoising models, that causes the denoiser's predictions to collapse towards the mean of the source or target distributions. This degeneracy creates false basins of attraction, distorting the denoising trajectories and ultimately increasing the number of steps required to sample these models. We circumvent this artifact by leveraging the deterministic ODE-based samplers, offered by certain denoising diffusion and score-matching models, which establish a well-defined change-of-variables between the source and target distributions. Given this correspondence, we propose a new probability flow model, the Lines Matching Model (LMM), which matches globally straight lines interpolating the two distributions. We demonstrate that the flow fields produced by the LMM exhibit notable temporal consistency, resulting…
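The straight-line idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_sampler` is a hypothetical stand-in for a deterministic ODE-based sampler (here just a fixed affine map), and `line_point` and `line_velocity` are illustrative names. The sketch only shows that when each source point is paired with a fixed target, the interpolating line has a constant velocity, so the endpoint is reached exactly in one step or in several.

```python
import numpy as np

def toy_sampler(x0):
    # Hypothetical stand-in for a deterministic ODE-based sampler:
    # a fixed invertible map from source samples to target samples.
    return 2.0 * x0 + 1.0

def line_point(x0, x1, t):
    # Point at time t on the straight line from x0 (t=0) to x1 (t=1).
    return (1.0 - t) * x0 + t * x1

def line_velocity(x0, x1):
    # The velocity along a straight line does not depend on t.
    return x1 - x0

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 2))   # source samples
x1 = toy_sampler(x0)               # paired targets from the deterministic map

v = line_velocity(x0, x1)

# One Euler step of size 1 from the source lands on the target.
one_step = x0 + 1.0 * v

# Two Euler steps of size 0.5 land on the same target, because the
# velocity is the same at every point of the line.
midpoint = line_point(x0, x1, 0.5)
two_step = midpoint + 0.5 * v
```

In a learned model the velocity (or endpoint) would come from a network rather than from the known pair, so the agreement between one-step and multi-step sampling would only be approximate.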

Peer Reviews

Decision·ICLR 2025 Conference Withdrawn Submission

Reviewer 01 · Rating 3 · Confidence 4

Strengths

- The pairs produced by the diffusion ODE provide clearer signals than the diffusion model prediction, and the perceptual and adversarial losses are common techniques for improving FID, so it is no surprise that they bring improvements.
- The 1-step and 2-step results on standard image datasets are better than those of previous works.

Weaknesses

- A large portion of the writing explains the flaws of diffusion models in few-step generation and the advantage of straightness. However, these ideas largely come from previous works such as Flow Matching, Stochastic Interpolants, Rectified Flow, and Consistency Models, and are well known to readers familiar with diffusion models. The authors should simplify this part and focus on the differences from prior work.
- The authors make claims which seem to convince the readers that diffusio

Reviewer 02 · Rating 3 · Confidence 3

Strengths

The presentation is clear and easy to follow. The comparisons of baseline approaches illustrated in Table 1 are both informative and interesting. I appreciate the analysis in Appendix A.2. Wrapping it up as a theorem would be even clearer to understand the result and claim.

Weaknesses

1. Please be mindful of repeated notations. For instance, $N$ is used both as the maximum timestep and to denote the neural network with its inputs omitted.
2. The proposed method advocates directly matching the network prediction $N_{\theta}(\cdot, \sigma)$ with the sample generated by the teacher model, $N^*_{\text{sampler}}$, in Eq. (3). However, creating the pairs $(x_0, N^*_{\text{sampler}}(x_0))$ may be expensive (offline and online), as it involves solving ODEs. Please also compare

Reviewer 03 · Rating 3 · Confidence 3

Strengths

The paper provides a new perspective on straight trajectories.

Weaknesses

- Let's focus on the setting where $p_0(x_0)$ is the prior distribution. The authors' claim is then that the coupling $(x_0,x_1)$, where $x_1$ is the solution of the PF-ODE, works better than previous algorithms. What does the neural network learn? Does it learn some trajectory? Which trajectory does it learn? How does one sample with multiple steps? If it samples like multistep CM, then what is the point of having straighter trajectories?
- I personally am not fully convinced of the whole sto

Reviewer 04 · Rating 5 · Confidence 4

Strengths

The paper is well written. The motivation is clear and the proposed technique is reasonable.

Weaknesses

1. The main idea of LMM is to use a pretrained diffusion model to build matched data pairs that replace the randomly selected data pairs used in the original diffusion models. To me, this idea seems to have already been proposed and explored by many previous distillation works such as [1,2,3]. The difference is that works like [1,2] also consider the computational cost of this distillation. Thus, they only run one or two denoising steps instead of running the denoising process till


Taxonomy

Topics: Genome Rearrangement Algorithms · Bayesian Methods and Mixture Models

Methods: Sparse Evolutionary Training · Diffusion