Diminishing Stereotype Bias in Image Generation Model using Reinforcement Learning Feedback

Xin Chen; Virgile Foussereau

arXiv:2407.09551·cs.CV·July 16, 2024

Open Access

TL;DR

This paper uses reinforcement learning from AI feedback to reduce gender bias in image generation models, preserving image quality without extra data or prompt changes.

Contribution

Introduces a novel RLAIF method with a DDPO pipeline and two new reward functions to mitigate gender bias in diffusion-based image generation.

Findings

01 Effective bias mitigation without quality loss

02 No additional data or prompt modifications needed

03 Foundation for addressing various AI biases

Abstract

This study addresses gender bias in image generation models using Reinforcement Learning from Artificial Intelligence Feedback (RLAIF) with a novel Denoising Diffusion Policy Optimization (DDPO) pipeline. By employing a pretrained Stable Diffusion model and a highly accurate gender classification Transformer, the research introduces two reward functions: R_shift for shifting gender imbalances, and R_balance for achieving and maintaining gender balance. Experiments demonstrate the effectiveness of this approach in mitigating bias without compromising image quality or requiring additional data or prompt modifications. While focusing on gender bias, this work establishes a foundation for addressing various forms of bias in AI systems, emphasizing the need for responsible AI development. Future research directions include extending the methodology to other bias types, enhancing the RLAIF…
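
The abstract names the two reward functions but does not give their formulas. The sketch below is only one plausible reading, not the authors' definitions: it assumes the classifier returns, for each generated image, a probability for the underrepresented ("target") class. The names `r_shift`, `r_balance`, `p_target`, and `target_ratio` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def r_shift(p_target: float) -> float:
    """Hypothetical shift reward: the classifier's probability for the
    target (underrepresented) class, so images of that class score higher."""
    if not 0.0 <= p_target <= 1.0:
        raise ValueError("p_target must be a probability in [0, 1]")
    return p_target


def r_balance(p_targets: Sequence[float], target_ratio: float = 0.5) -> List[float]:
    """Hypothetical balance reward, computed per image over one batch.

    The batch ratio is the fraction of images classified as the target class
    (probability >= 0.5). Images on the under-represented side of the batch
    get a positive reward, scaled by how far the batch is from target_ratio.
    When the batch already matches target_ratio, every reward is zero.
    """
    if not p_targets:
        return []
    ratio = sum(p >= 0.5 for p in p_targets) / len(p_targets)
    gap = target_ratio - ratio  # > 0 means too few target-class images
    # (2p - 1) maps a probability to [-1, 1]: the sign says which class the
    # image leans toward, the magnitude says how confidently.
    return [gap * (2.0 * p - 1.0) for p in p_targets]


if __name__ == "__main__":
    batch = [0.9, 0.1, 0.1, 0.1]  # one target-class image out of four
    print([r_shift(p) for p in batch])
    print(r_balance(batch))
```

In a DDPO-style loop, per-image scores like these would stand in for the reward attached to each sampled denoising trajectory. The design difference is that the shift-style reward keeps pushing in one direction, while the balance-style reward vanishes once the batch reaches the target ratio, which matches the abstract's description of "achieving and maintaining" balance.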

Taxonomy

Topics: Smart Systems and Machine Learning

Methods: Attention Is All You Need · Residual Connection · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Diffusion · Adam · Dropout · Multi-Head Attention