Redistribute Ensemble Training for Mitigating Memorization in Diffusion Models
Xiaoliu Guan, Yu Wu, Huayang Huang, Xiao Liu, Jiaxu Miao, Yi Yang

TL;DR
This paper introduces a framework that reduces memorization, and the privacy risks that come with it, in diffusion models. It trains proxy models on shards of the dataset, selectively skips and redistributes samples based on their loss values, and reports a substantial reduction in memorization without sacrificing generation quality.
Contribution
The paper proposes a generic method, framed from the perspective of the visual modality, that mitigates memorization in diffusion models through proxy-model training, sample skipping, and sample redistribution, thereby improving privacy protection.
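To make the idea of loss-based sample skipping concrete, here is a minimal sketch. It rests on assumptions that this summary does not confirm: that samples with unusually low loss are the ones skipped, and that "unusually low" is judged against the batch mean. The function `select_samples` and the parameter `skip_ratio` are hypothetical names used only for illustration. Redistribution is left out because the text above does not say how it works.

```python
def select_samples(losses, skip_ratio=0.5):
    """Return the indices of samples kept for the gradient update.

    Assumption (not from the paper summary): a sample is skipped when its
    loss falls below skip_ratio times the mean loss of the batch.
    """
    mean_loss = sum(losses) / len(losses)
    threshold = skip_ratio * mean_loss
    return [i for i, loss in enumerate(losses) if loss >= threshold]


# Hypothetical per-sample losses for one batch.
batch_losses = [0.90, 0.05, 1.10, 0.95]
kept = select_samples(batch_losses)

# Only the kept samples would contribute to the training loss.
kept_mean_loss = sum(batch_losses[i] for i in kept) / len(kept)
```

In this toy batch the second sample has a much lower loss than the others, so it is the only one dropped from the update.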
Findings
Reduces memorization score by 46.7% on fine-tuned models
Maintains high-quality image generation performance
Effective across four datasets
Abstract
Diffusion models, known for their tremendous ability to generate high-quality samples, have recently raised concerns due to their data memorization behavior, which poses privacy risks. Recent methods for mitigating memorization have primarily addressed the issue within the text modality in cross-modal generation tasks, which restricts their applicability to specific conditions. In this paper, we propose a novel method for diffusion models from the perspective of the visual modality, which is more generic and fundamental for mitigating memorization. Directly exposing visual data to the model increases memorization risk, so we design a framework in which the model learns through the parameters of proxy models instead. Specifically, the training dataset is divided into multiple shards, each shard trains a proxy model, and the proxy models are then aggregated to form the final model. Additionally, practical analysis of…
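The shard-and-aggregate scheme described in the abstract can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the aggregation rule is assumed to be a plain average of parameters, the "model" is a two-parameter linear fit standing in for a diffusion model, and the names `split_into_shards`, `train_proxy`, and `aggregate` are hypothetical.

```python
import random


def split_into_shards(dataset, num_shards, seed=0):
    """Shuffle the data and deal it into num_shards disjoint shards."""
    rng = random.Random(seed)
    indices = list(range(len(dataset)))
    rng.shuffle(indices)
    return [[dataset[i] for i in indices[k::num_shards]]
            for k in range(num_shards)]


def train_proxy(shard, init_params, lr=0.1, epochs=50):
    """Toy stand-in for proxy training: fit y = w * x + b with SGD on one shard."""
    w, b = init_params["w"], init_params["b"]
    for _ in range(epochs):
        for x, y in shard:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return {"w": w, "b": b}


def aggregate(proxy_params):
    """Average each parameter across proxies (assumed aggregation rule)."""
    n = len(proxy_params)
    return {k: sum(p[k] for p in proxy_params) / n for k in proxy_params[0]}


# Synthetic data from y = 2x + 1; each proxy sees only its own shard.
dataset = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(20)]
shards = split_into_shards(dataset, num_shards=4)
init = {"w": 0.0, "b": 0.0}
proxies = [train_proxy(shard, init) for shard in shards]
final_params = aggregate(proxies)
```

The point of the sketch is the data flow: the final parameters are built only from proxy parameters, so no single training pass touches the whole dataset.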
Taxonomy
Topics: Topic Modeling · Speech Recognition and Synthesis
Methods: Diffusion
