FPAN: Mitigating Replication in Diffusion Models through the Fine-Grained Probabilistic Addition of Noise to Token Embeddings

Jingqi Xu, Chenghao Li, Yuke Zhang, and Peter A. Beerel

arXiv:2505.21848 · cs.CV · May 29, 2025

TL;DR

This paper introduces FPAN, a noise injection method that probabilistically adds a larger amount of noise to individual token embeddings in diffusion models, reducing data replication by 28.78% on average while maintaining image quality.

Contribution

The paper proposes FPAN, a fine-grained probabilistic noise addition technique that mitigates data replication in diffusion models and outperforms the prior noise addition approach by 26.51%.

Findings

01. Reduces data replication by 28.78% on average

02. Outperforms the prior noise addition approach by 26.51%

03. Further reduces replication by up to 16.82% when combined with other methods

Abstract

Diffusion models have demonstrated remarkable potential in generating high-quality images. However, their tendency to replicate training data raises serious privacy concerns, particularly when the training datasets contain sensitive or private information. Existing mitigation strategies primarily focus on reducing image duplication, modifying the cross-attention mechanism, and altering the denoising backbone architecture of diffusion models. Moreover, recent work has shown that adding a consistent small amount of noise to text embeddings can reduce replication to some degree. In this work, we begin by analyzing the impact of adding varying amounts of noise. Based on our analysis, we propose a fine-grained noise injection technique that probabilistically adds a larger amount of noise to token embeddings. We refer to our method as Fine-grained Probabilistic Addition of Noise (FPAN).…
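
The contrast the abstract draws, between adding a consistent small amount of noise to every text embedding and probabilistically adding a larger amount to token embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each token embedding is independently selected with probability `p` and that the noise is zero-mean Gaussian with standard deviation `sigma`. The function names, the parameters `p` and `sigma`, and their default values are all hypothetical and chosen only for illustration.

```python
import random


def add_noise_uniformly(token_embeddings, sigma=0.1, rng=None):
    """Baseline sketch: add small Gaussian noise to every token embedding.

    Assumption: the "consistent small amount of noise" is modeled as
    N(0, sigma^2) applied to all entries; sigma=0.1 is illustrative only.
    """
    if rng is None:
        rng = random.Random()
    return [[x + rng.gauss(0.0, sigma) for x in vec] for vec in token_embeddings]


def add_noise_probabilistically(token_embeddings, p=0.5, sigma=1.0, rng=None):
    """FPAN-style sketch: perturb each token embedding with probability p.

    Assumptions (not taken from the paper): the per-token decision is an
    independent Bernoulli(p) draw, and selected tokens receive N(0, sigma^2)
    noise with a larger sigma than the baseline. Unselected tokens are
    copied unchanged. The input is never modified in place.
    """
    if rng is None:
        rng = random.Random()
    noised = []
    for vec in token_embeddings:
        if rng.random() < p:
            # Selected token: add the larger-magnitude noise to every entry.
            noised.append([x + rng.gauss(0.0, sigma) for x in vec])
        else:
            # Unselected token: keep the embedding as it is.
            noised.append(list(vec))
    return noised


if __name__ == "__main__":
    # Toy "prompt" of 6 tokens with 4-dimensional embeddings, all zeros.
    embeddings = [[0.0] * 4 for _ in range(6)]
    rng = random.Random(0)  # seeded so repeated runs give the same result
    noised = add_noise_probabilistically(embeddings, p=0.5, sigma=1.0, rng=rng)
    changed = sum(1 for a, b in zip(noised, embeddings) if a != b)
    print(f"{changed} of {len(embeddings)} token embeddings were perturbed")
```

In a real pipeline the embeddings would be the text encoder's per-token output tensor rather than nested lists, and the values of `p` and `sigma` would have to come from the paper's analysis. The sketch only shows the per-token gating that distinguishes this style of noise injection from adding the same small noise everywhere.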

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Privacy-Preserving Technologies in Data · Advanced Steganography and Watermarking Techniques