Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models

Sanghyun Kim; Seohyeon Jung; Balhae Kim; Moonseok Choi; Jinwoo Shin; Juho Lee

arXiv:2307.05977·cs.CV·July 13, 2023·5 cites

Open Access · 1 Repo

TL;DR

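This paper introduces SDD, a self-distillation method for text-to-image diffusion models that trains the noise estimate conditioned on a target concept to match the unconditional one. Compared with previous methods, it removes a much greater proportion of harmful content without degrading overall image quality, and it can remove multiple concepts simultaneously.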

Contribution

SDD supports removing multiple concepts simultaneously and, compared with previous methods, eliminates a much greater proportion of harmful outputs from text-to-image diffusion models.

Findings

01 Reduces harmful content in generated images

02 Allows removal of multiple concepts simultaneously

03 Maintains high image quality

Abstract

Large-scale image generation models, with impressive quality made possible by the vast amount of data available on the Internet, raise social concerns that these models may generate harmful or copyrighted content. The biases and harmfulness arise throughout the entire training process and are hard to completely remove, which have become significant hurdles to the safe deployment of these models. In this paper, we propose a method called SDD to prevent problematic content generation in text-to-image diffusion models. We self-distill the diffusion model to guide the noise estimate conditioned on the target removal concept to match the unconditional one. Compared to the previous methods, our method eliminates a much greater proportion of harmful content from the generated images without degrading the overall image quality. Furthermore, our method allows the removal of multiple concepts at…

Code & Models

Repositories

nannullna/safe-diffusion
PyTorch · Official

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Diffusion