One-Step Diffusion Distillation via Deep Equilibrium Models

Zhengyang Geng; Ashwini Pokle; J. Zico Kolter

arXiv:2401.08639 · cs.CV · January 18, 2024 · 2 cites

TL;DR

This paper distills a diffusion model into a one-step image generator built on a Deep Equilibrium (DEQ) model. Training is fully offline, using only noise/image pairs sampled from the diffusion model, and the result outperforms existing one-step methods.

Contribution

The paper introduces the Generative Equilibrium Transformer (GET), a DEQ-based architecture for distilling a diffusion model directly from initial noise to the resulting image.

Findings

01 GET matches a larger ViT in FID score.

02 The method enables fully offline training using only noise/image pairs from the diffusion model.

03 GET outperforms existing one-step methods on comparable training budgets.
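
The offline setup in finding 02 can be illustrated with a toy sketch. It assumes a hypothetical `teacher_sample` function standing in for a slow multi-step diffusion sampler and a single-layer `student` in place of GET. Neither is the authors' code, and the dimensions, learning rate, and loss are illustrative choices only. The point is the data flow: the teacher is queried once to build a fixed set of noise/image pairs, and the student is then fit on those pairs alone.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8

# Hypothetical teacher weights; a real teacher would be a diffusion model.
TEACHER_W = rng.normal(size=(DIM, DIM)) / np.sqrt(DIM)


def teacher_sample(noise):
    """Stand-in for an expensive multi-step sampler (assumption, not GET)."""
    return np.tanh(noise @ TEACHER_W)


def student(noise, w):
    """One-step student: a single forward pass from noise to image."""
    return np.tanh(noise @ w)


# Step 1 (offline): query the teacher once and store the pairs.
noise = rng.normal(size=(512, DIM))
images = teacher_sample(noise)


def mse(w):
    return float(np.mean((student(noise, w) - images) ** 2))


# Step 2: fit the student on the stored pairs; the teacher is never called again.
w = np.zeros((DIM, DIM))
initial_loss = mse(w)
for _ in range(500):
    pred = student(noise, w)
    # Gradient of the mean squared error, back-propagated through tanh.
    grad_pre = 2.0 * (pred - images) * (1.0 - pred ** 2) / pred.size
    w -= 1.0 * (noise.T @ grad_pre)
final_loss = mse(w)

print(f"loss before: {initial_loss:.4f}, after: {final_loss:.6f}")
```

Because the pairs are fixed in advance, training in this sketch needs no access to the teacher and no multi-stage schedule, which is the property the finding highlights.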

Abstract

Diffusion models excel at producing high-quality samples but naively require hundreds of iterations, prompting multiple attempts to distill the generation process into a faster network. However, many existing approaches suffer from a variety of challenges: the process for distillation training can be complex, often requiring multiple training stages, and the resulting models perform poorly when utilized in single-step generative applications. In this paper, we introduce a simple yet effective means of distilling diffusion models directly from initial noise to the resulting image. Of particular importance to our approach is to leverage a new Deep Equilibrium (DEQ) model as the distilled architecture: the Generative Equilibrium Transformer (GET). Our method enables fully offline training with just noise/image pairs from the diffusion model while achieving superior performance compared to…
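
For readers unfamiliar with Deep Equilibrium models, the following minimal sketch shows the general idea: instead of stacking many distinct layers, a DEQ applies one layer repeatedly until its output stops changing, and uses that fixed point as the result. The layer here is a toy `tanh` map with hypothetical weights `w`, `u`, `b`, solved by plain iteration; GET itself is a transformer, and the solver used in the paper may differ.

```python
import numpy as np


def deq_forward(x, w, u, b, tol=1e-8, max_iter=200):
    """Iterate z <- tanh(w @ z + u @ x + b) until z stops changing."""
    z = np.zeros_like(b)
    for step in range(max_iter):
        z_next = np.tanh(w @ z + u @ x + b)
        if np.linalg.norm(z_next - z) < tol:
            return z_next, step + 1
        z = z_next
    return z, max_iter


rng = np.random.default_rng(0)
dim = 6

# Rescale w to spectral norm 0.5 so the layer is a contraction and the
# iteration is guaranteed to converge (an assumption of this toy setup).
w = rng.normal(size=(dim, dim))
w *= 0.5 / np.linalg.norm(w, 2)
u = rng.normal(size=(dim, dim))
b = rng.normal(size=dim)
x = rng.normal(size=dim)  # input injection, e.g. the initial noise

z_star, steps = deq_forward(x, w, u, b)
residual = np.linalg.norm(z_star - np.tanh(w @ z_star + u @ x + b))
print(f"converged in {steps} iterations, residual {residual:.2e}")
```

The equilibrium `z_star` satisfies the layer equation up to the tolerance, so the effective depth is set by the solver rather than by the parameter count.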

Code & Models

Repositories

locuslab/get (PyTorch, official)

Videos

One-Step Diffusion Distillation via Deep Equilibrium Models · SlidesLive

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Advanced Mathematical Modeling in Engineering

Methods: Multi-Head Attention · Attention Is All You Need · Label Smoothing · Absolute Position Encodings · Layer Normalization · Dropout · Linear Layer · Byte Pair Encoding · Softmax · Adam