Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models
Korada Sri Vardhana, Shrikrishna Lolla, Soma Biswas

TL;DR
SelfDebias is a fully unsupervised method that reduces biases in text-to-image diffusion models during inference by automatically identifying semantic clusters, improving fairness without requiring labeled data.
Contribution
It introduces a novel test-time debiasing approach that automatically detects semantic modes to mitigate biases in diffusion models without supervision.
Findings
Effectively debiases images across demographic dimensions
Maintains high visual fidelity of generated images
Generalizes across different diffusion architectures
Abstract
Text-to-image (T2I) diffusion models have achieved widespread success due to their ability to generate high-resolution, photorealistic images. These models are trained on large-scale datasets such as LAION-5B, which are often scraped from the internet. However, since this data contains numerous biases, the models inherently learn and reproduce them, resulting in stereotypical outputs. We introduce SelfDebias, a fully unsupervised test-time debiasing method applicable to any diffusion model that uses a UNet as its noise predictor. SelfDebias identifies semantic clusters in an image encoder's embedding space and uses these clusters to guide the diffusion process during inference, minimizing the KL divergence between the output distribution and the uniform distribution. Unlike supervised approaches, SelfDebias does not require human-annotated datasets or external classifiers trained for each generated…
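The KL objective mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the cosine-similarity softmax used for soft cluster assignment, the `temperature` parameter, and the choice of KL direction (batch distribution relative to uniform) are all assumptions made for illustration. It only shows how a batch that is skewed toward one cluster scores a larger divergence from uniform than a balanced batch.

```python
import numpy as np

def soft_assignments(embeddings, centroids, temperature=0.1):
    """Softly assign each embedding to clusters (hypothetical helper).

    Assumes cosine similarity to cluster centroids followed by a softmax;
    the paper's actual assignment rule may differ.
    """
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    logits = (e @ c.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def kl_to_uniform(probs, eps=1e-12):
    """KL(q || U), where q is the batch-averaged cluster distribution.

    The KL direction is an assumption; the abstract does not specify it.
    """
    q = probs.mean(axis=0)
    k = q.shape[0]
    return float(np.sum(q * (np.log(q + eps) - np.log(1.0 / k))))

# Toy data: two clusters with orthogonal centroids.
centroids = np.array([[1.0, 0.0], [0.0, 1.0]])
balanced = np.array([[1.0, 0.0], [0.0, 1.0]])
skewed = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

kl_balanced = kl_to_uniform(soft_assignments(balanced, centroids))
kl_skewed = kl_to_uniform(soft_assignments(skewed, centroids))
print(kl_balanced < kl_skewed)
```

In a test-time guidance setting, a quantity like this would be differentiated with respect to the intermediate latents and used to steer sampling; that step depends on the specific diffusion pipeline and is omitted here.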
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Cell Image Analysis Techniques · Advanced Neuroimaging Techniques and Applications
