PoGDiff: Product-of-Gaussians Diffusion Models for Imbalanced Text-to-Image Generation

Ziyan Wang; Sizhe Wei; Xiaoming Huo; Hao Wang

arXiv:2502.08106·cs.LG·June 17, 2025

Open Access

TL;DR

PoGDiff introduces a novel fine-tuning method for diffusion models that uses a Product of Gaussians to better handle imbalanced text-to-image datasets, leading to improved generation quality.

Contribution

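The paper proposes PoGDiff, a fine-tuning approach that replaces the ground-truth distribution in the KL-divergence objective with a Product of Gaussians (PoG), built from the original ground-truth target and the predicted distribution conditioned on a neighboring text embedding.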

Findings

01 Enhanced image generation quality on imbalanced datasets

02 Improved accuracy in text-to-image synthesis

03 Effective mitigation of data imbalance issues

Abstract

Diffusion models have made significant advancements in recent years. However, their performance often deteriorates when trained or fine-tuned on imbalanced datasets. This degradation is largely due to the disproportionate representation of majority and minority data in image-text pairs. In this paper, we propose a general fine-tuning approach, dubbed PoGDiff, to address this challenge. Rather than directly minimizing the KL divergence between the predicted and ground-truth distributions, PoGDiff replaces the ground-truth distribution with a Product of Gaussians (PoG), which is constructed by combining the original ground-truth targets with the predicted distribution conditioned on a neighboring text embedding. Experiments on real-world datasets demonstrate that our method effectively addresses the imbalance problem in diffusion models, improving both generation accuracy and quality.
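The Product of Gaussians named in the abstract rests on a standard identity: the renormalized product of two Gaussian densities is again Gaussian, with precisions (inverse variances) that add and a mean that is the precision-weighted average of the two means. The sketch below illustrates only that identity in one dimension. The function names `product_of_gaussians` and `pog_target`, the variance arguments, and the elementwise treatment are illustrative assumptions; the abstract does not specify how PoGDiff weights the two terms or how the neighboring text embedding is selected.

```python
from typing import List, Sequence, Tuple


def product_of_gaussians(
    mu_a: float, var_a: float, mu_b: float, var_b: float
) -> Tuple[float, float]:
    """Mean and variance of the renormalized product N(mu_a, var_a) * N(mu_b, var_b)."""
    if var_a <= 0.0 or var_b <= 0.0:
        raise ValueError("variances must be positive")
    prec_a, prec_b = 1.0 / var_a, 1.0 / var_b  # precisions add under a product
    prec = prec_a + prec_b
    mean = (prec_a * mu_a + prec_b * mu_b) / prec  # precision-weighted average
    return mean, 1.0 / prec


def pog_target(
    y_true: Sequence[float],
    y_neighbor: Sequence[float],
    var_true: float,
    var_neighbor: float,
) -> List[float]:
    """Elementwise PoG mean of a ground-truth target and a neighbor-conditioned
    prediction (hypothetical helper; the weighting is an assumption)."""
    if len(y_true) != len(y_neighbor):
        raise ValueError("targets must have the same length")
    return [
        product_of_gaussians(t, var_true, n, var_neighbor)[0]
        for t, n in zip(y_true, y_neighbor)
    ]


if __name__ == "__main__":
    # Equal variances: the mean is the midpoint and the variance halves.
    print(product_of_gaussians(0.0, 1.0, 2.0, 1.0))  # (1.0, 0.5)
```

In this sketch, a larger `var_neighbor` lowers the neighbor's precision and pulls the combined target back toward the ground-truth value, while a smaller one pulls it toward the neighbor-conditioned prediction.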

Taxonomy

Topics: Digital Media Forensic Detection · Advanced Steganography and Watermarking Techniques

Methods: Diffusion