TL;DR
This paper accelerates diffusion and flow model sampling by rewiring the network's transformer blocks and using a length consistency term during training to control the quality-complexity tradeoff, which yields faster and more memory-efficient generation.
Contribution
It proposes ODE$_t$(ODE$_l$), a solver-agnostic approach that rewires transformer blocks and uses length consistency during training to reduce sampling latency and memory usage.
Findings
Up to 2x latency reduction in sampling.
FID improvement of up to 2.8 points on CelebA-HQ and ImageNet.
Applicable to existing diffusion and flow models for faster, memory-efficient sampling.
Abstract
Continuous normalizing flows (CNFs) and diffusion models (DMs) generate high-quality data from a noise distribution. However, their sampling process demands multiple iterations to solve an ordinary differential equation (ODE) with high computational complexity. State-of-the-art methods focus on reducing the number of discrete time steps during sampling to improve efficiency. In this work, we explore a complementary direction in which the quality-complexity tradeoff can also be controlled in terms of the neural network length. We achieve this by rewiring the blocks in the transformer-based architecture to solve an inner discretized ODE w.r.t. its depth. Then, we apply a length consistency term during flow matching training, and as a result, the sampling can be performed with an arbitrary number of time steps and transformer blocks. Unlike others, our ODE$_t$(ODE$_l$) approach is…
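The nested structure described in the abstract, an outer ODE over time whose velocity comes from an inner ODE over network depth, can be illustrated with a toy NumPy sketch. Everything here is an assumption made for illustration and not the paper's implementation: the names `make_blocks`, `inner_ode`, `sample`, and `length_consistency_gap` are hypothetical, the "blocks" are small tanh layers standing in for transformer blocks, both ODEs use plain Euler steps, and the consistency measure is a simple mean squared difference rather than the paper's actual training term.

```python
import numpy as np

def make_blocks(num_blocks, dim, seed=0):
    # Toy stand-ins for transformer blocks: one weight matrix per block (assumption).
    rng = np.random.default_rng(seed)
    return [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(num_blocks)]

def inner_ode(h, t, blocks, active):
    # Euler integration over depth l in [0, 1], using `active` evenly spaced blocks.
    # Fewer active blocks means a larger depth step dl, i.e. a coarser inner solve.
    idx = np.linspace(0, len(blocks) - 1, active).round().astype(int)
    dl = 1.0 / active
    for i in idx:
        h = h + dl * np.tanh(h @ blocks[i] + t)
    return h

def sample(x, blocks, time_steps, active):
    # Euler integration over time t in [0, 1]; the inner solve's final state
    # is treated as the velocity at (x, t) in this sketch.
    dt = 1.0 / time_steps
    for k in range(time_steps):
        v = inner_ode(x, k * dt, blocks, active)
        x = x + dt * v
    return x

def length_consistency_gap(x, t, blocks, active):
    # Hypothetical consistency measure: mean squared difference between the
    # full-depth output and the shortened-depth output at the same (x, t).
    full = inner_ode(x, t, blocks, len(blocks))
    short = inner_ode(x, t, blocks, active)
    return float(np.mean((full - short) ** 2))

blocks = make_blocks(num_blocks=8, dim=4)
noise = np.random.default_rng(1).normal(size=(2, 4))
high_quality = sample(noise, blocks, time_steps=8, active=8)  # more compute
fast = sample(noise, blocks, time_steps=2, active=4)          # less compute
```

In this sketch the two knobs, `time_steps` and `active`, are independent, which mirrors the abstract's claim that sampling can use an arbitrary number of time steps and transformer blocks. A training term that drives something like `length_consistency_gap` toward zero is what would make the shortened network a usable approximation of the full one.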
Taxonomy
Methods: Normalizing Flows · Diffusion
