Redistribute Ensemble Training for Mitigating Memorization in Diffusion Models

Xiaoliu Guan; Yu Wu; Huayang Huang; Xiao Liu; Jiaxu Miao; Yi Yang

arXiv:2502.09434·cs.CV·February 14, 2025

TL;DR

This paper introduces a framework for diffusion models that reduces memorization and the associated privacy risks. Models are trained through proxy models, and samples are selectively skipped and redistributed based on their loss values. The reported results show a significant reduction in memorization without sacrificing generation quality.

Contribution

The paper proposes a generic method, approached from the visual modality, that mitigates memorization in diffusion models through proxy training, sample skipping, and sample redistribution, thereby improving privacy protection.
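
The skipping and redistribution steps named above can be illustrated with a small sketch. This is not the authors' implementation: the function names `split_by_loss` and `redistribute`, the fixed `skip_threshold`, the choice to skip samples whose loss falls below it, and the rule of moving skipped samples to the next shard are all assumptions made for illustration only.

```python
def split_by_loss(losses, skip_threshold):
    """Partition sample positions by loss.

    Assumption: a loss below skip_threshold marks a sample to skip.
    Returns (kept_positions, skipped_positions).
    """
    kept = [i for i, loss in enumerate(losses) if loss >= skip_threshold]
    skipped = [i for i, loss in enumerate(losses) if loss < skip_threshold]
    return kept, skipped


def redistribute(shards, shard_idx, sample_ids):
    """Return new shards with sample_ids moved from shards[shard_idx]
    to the next shard (wrapping around). Assumes at least two shards.
    The input shards are left unmodified.
    """
    target = (shard_idx + 1) % len(shards)
    moved = set(sample_ids)
    new_shards = [list(s) for s in shards]
    new_shards[shard_idx] = [s for s in shards[shard_idx] if s not in moved]
    new_shards[target].extend(s for s in shards[shard_idx] if s in moved)
    return new_shards


# Hypothetical data: two shards of sample ids and per-sample losses for shard 0.
shards = [["a", "b", "c"], ["d", "e"]]
shard0_losses = [0.42, 0.01, 0.37]

kept, skipped = split_by_loss(shard0_losses, skip_threshold=0.05)
flagged_ids = [shards[0][i] for i in skipped]
new_shards = redistribute(shards, shard_idx=0, sample_ids=flagged_ids)
```

In this toy run, only the samples in `kept` would contribute to the parameter update for shard 0, while the flagged sample is handed to a different shard's proxy model for later training.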

Findings

01

Reduces memorization score by 46.7% on fine-tuned models

02

Maintains high-quality image generation performance

03

Effective across four datasets

Abstract

Diffusion models, known for their tremendous ability to generate high-quality samples, have recently raised concerns due to their data memorization behavior, which poses privacy risks. Recent methods for memorization mitigation have primarily addressed the issue within the context of the text modality in cross-modal generation tasks, restricting their applicability to specific conditions. In this paper, we propose a novel method for diffusion models from the perspective of visual modality, which is more generic and fundamental for mitigating memorization. Directly exposing visual data to the model increases memorization risk, so we design a framework where models learn through proxy model parameters instead. Specifically, the training dataset is divided into multiple shards, each shard trains a proxy model, and the proxy models are then aggregated to form the final model. Additionally, practical analysis of…
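
The shard, proxy, and aggregate pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's method: a two-parameter linear model stands in for a diffusion model, the helper names (`split_into_shards`, `train_proxy`, `aggregate`) are hypothetical, and uniform parameter averaging is an assumed aggregation rule, since the abstract does not specify how the proxy models are combined.

```python
import random


def split_into_shards(samples, num_shards, seed=0):
    """Shuffle samples and deal them round-robin into disjoint shards."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::num_shards] for i in range(num_shards)]


def train_proxy(shard, init_params, lr=0.1, epochs=2000):
    """Toy stand-in for proxy training: fit y = w*x + b on one shard with SGD."""
    w, b = init_params
    for _ in range(epochs):
        for x, y in shard:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def aggregate(proxy_params):
    """Assumed aggregation rule: uniform average of the proxy parameters."""
    n = len(proxy_params)
    dim = len(proxy_params[0])
    return tuple(sum(p[i] for p in proxy_params) / n for i in range(dim))


# Hypothetical noiseless data drawn from y = 2x + 1.
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]

shards = split_into_shards(data, num_shards=3)
proxies = [train_proxy(shard, init_params=(0.0, 0.0)) for shard in shards]
final_w, final_b = aggregate(proxies)
```

The point of the structure is that each proxy model sees only its own shard, and the final parameters are derived from the proxies rather than from a single pass over the full dataset.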

Code & Models

Repositories

liuxiao-guan/iet_agc
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Speech Recognition and Synthesis

Methods: Diffusion