Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss

Yatharth Gupta; Vishnu V. Jaddipal; Harish Prabhala; Sayak Paul; Patrick Von Platen

arXiv:2401.02677·cs.CV·January 8, 2024·2 cites

Open Access · 1 Repo · 3 Models

TL;DR

This paper presents a progressive knowledge distillation approach to create smaller, efficient versions of Stable Diffusion XL by removing layers and using layer-level losses, maintaining high image quality with fewer parameters.

Contribution

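The authors introduce two compact SDXL variants, SSD-1B and Segmind-Vega, obtained by progressively removing residual and transformer blocks from the U-Net while training with layer-level losses. The resulting models can be deployed more efficiently without significant quality loss.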

Findings

01 Models achieve comparable quality to SDXL with fewer parameters

02 Significant reduction in latency and model size

03 Effective knowledge transfer from larger to smaller models

Abstract

Stable Diffusion XL (SDXL) has become the best open-source text-to-image (T2I) model for its versatility and top-notch image quality. Efficiently addressing the computational demands of SDXL models is crucial for wider reach and applicability. In this work, we introduce two scaled-down variants, Segmind Stable Diffusion (SSD-1B) and Segmind-Vega, with 1.3B and 0.74B parameter UNets, respectively, achieved through progressive removal using layer-level losses, with a focus on reducing the model size while preserving generative quality. We release the weights of these models at https://hf.co/Segmind. Our methodology involves the elimination of residual networks and transformer blocks from the U-Net structure of SDXL, resulting in significant reductions in parameters and latency. Our compact models effectively emulate the original SDXL by capitalizing on transferred knowledge, achieving competitive…
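
The abstract names layer-level losses without giving their form on this page. The sketch below illustrates one common way such a loss can be set up, under stated assumptions: the student keeps a subset of the teacher's layers, each kept layer's activations are compared to the teacher's with a mean squared error, and the per-layer terms are added to a task loss and an output-level distillation loss. The function names, layer names, weights (`w_out`, `w_feat`), and the choice of MSE are all hypothetical and are not taken from the paper.

```python
def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def layer_level_loss(teacher_feats, student_feats, kept_layers):
    """Sum the per-layer MSE over the layers the student retains.

    teacher_feats / student_feats map a layer name to flattened activations.
    Layers removed from the student contribute nothing (assumption).
    """
    return sum(mse(teacher_feats[n], student_feats[n]) for n in kept_layers)


def distillation_loss(task, output_kd, feature_kd, w_out=1.0, w_feat=1.0):
    """Weighted sum of task, output-level, and layer-level terms (assumed form)."""
    return task + w_out * output_kd + w_feat * feature_kd


# Toy example: the teacher has three blocks, the student dropped "mid".
teacher = {"down_0": [1.0, 2.0], "mid": [0.5, 0.5], "up_0": [3.0, 1.0]}
student = {"down_0": [1.0, 1.0], "up_0": [2.0, 1.0]}
kept = ["down_0", "up_0"]

feat = layer_level_loss(teacher, student, kept)  # 0.5 + 0.5 = 1.0
total = distillation_loss(task=0.2, output_kd=0.1, feature_kd=feat, w_feat=0.5)
```

In a progressive scheme, a loss of this kind would be recomputed after each removal step with an updated `kept` list, so the student is always matched against the teacher on the layers it still has.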

Code & Models

Repositories

segmind/ssd-1b (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Convolution · Max Pooling · Concatenated Skip Connection · Diffusion · Knowledge Distillation · U-Net