Harnessing Diffusion-Generated Synthetic Images for Fair Image Classification

Abhipsa Basu; Aviral Gupta; Abhijnya Bhat; R. Venkatesh Babu

arXiv:2511.08711 · cs.CV · December 2, 2025

Open Access

TL;DR

This paper investigates diffusion-model finetuning techniques for generating balanced synthetic images, improving fairness in image classification by reducing dataset bias and outperforming existing debiasing methods, especially on highly biased datasets.

Contribution

It explores multiple diffusion-finetuning methods, such as LoRA and DreamBooth, along with per-cluster DreamBooth training within each group, to generate balanced data that better represents each training group and improves fairness in image classification.

Findings

01. Finetuning diffusion models improves data balance.

02. Generated data enhances classification fairness.

03. Finetuned models outperform vanilla diffusion and match state-of-the-art debiasing methods.

Abstract

Image classification systems often inherit biases from uneven group representation in training data. For example, in face datasets for hair color classification, blond hair may be disproportionately associated with females, reinforcing stereotypes. A recent approach leverages the Stable Diffusion model to generate balanced training data, but such models often struggle to preserve the original data distribution. In this work, we explore multiple diffusion-finetuning techniques, e.g., LoRA and DreamBooth, to generate images that more accurately represent each training group by learning directly from that group's samples. Additionally, to prevent a single DreamBooth model from being overwhelmed by excessive intra-group variation, we explore clustering the images within each group and training one DreamBooth model per cluster. These models are then used to generate group-balanced…
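
The per-group clustering idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name a clustering algorithm, so plain k-means is assumed here, the embeddings are assumed to come from some image feature extractor, and the function names (kmeans, cluster_within_groups, generation_quotas) are hypothetical. The proportional split of each group's generation budget across its clusters is likewise an assumption, and the DreamBooth training step itself is not shown.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def cluster_within_groups(embeddings, groups, k):
    """Cluster each group separately; returns {group: {cluster_id: [sample indices]}}."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    result = {}
    for g, idxs in by_group.items():
        k_g = min(k, len(idxs))  # a small group cannot have more clusters than samples
        assign = kmeans([embeddings[i] for i in idxs], k_g)
        clusters = defaultdict(list)
        for i, a in zip(idxs, assign):
            clusters[a].append(i)
        result[g] = dict(clusters)
    return result


def generation_quotas(clusters_by_group, per_group):
    """Give every group the same budget, split across its clusters by cluster size (assumed)."""
    quotas = {}
    for g, clusters in clusters_by_group.items():
        total = sum(len(v) for v in clusters.values())
        items = sorted(clusters.items())
        shares = [per_group * len(v) // total for _, v in items]
        # Hand the rounding remainder to the largest clusters first.
        order = sorted(range(len(items)), key=lambda j: -len(items[j][1]))
        for j in order[: per_group - sum(shares)]:
            shares[j] += 1
        quotas[g] = {cid: s for (cid, _), s in zip(items, shares)}
    return quotas


# Toy data: 2-D stand-ins for image embeddings, with an over-represented group.
embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [0.0, 1.0], [9.0, 9.0]]
groups = ["blond_female"] * 4 + ["blond_male"] * 2
clusters = cluster_within_groups(embeddings, groups, k=2)
quotas = generation_quotas(clusters, per_group=10)
```

In this sketch each cluster would correspond to one finetuned model, and quotas[group][cluster_id] is the number of images that model would be asked to generate so that every group ends up with the same synthetic count regardless of its size in the real data.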

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Face and Expression Recognition