Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks
Yunqi Zhang, Songda Li, Chunyuan Deng, Luyi Wang, Hui Zhao

TL;DR
This paper introduces GAMA, a two-stage framework that reduces gender bias in vision-language models by generating gender-obfuscated narratives and integrating them with images and questions for fairer task performance.
Contribution
We propose GAMA, a novel, task-agnostic framework that mitigates gender bias in vision-language models through narrative generation and answer inference stages.
Findings
GAMA effectively reduces gender bias in various vision-language tasks.
GAMA demonstrates strong generalization across different datasets.
The framework improves fairness without sacrificing accuracy.
Abstract
Gender bias in vision-language models (VLMs) can reinforce harmful stereotypes and discrimination. In this paper, we focus on mitigating gender bias towards vision-language tasks. We identify object hallucination as the essence of gender bias in VLMs. Existing VLMs tend to focus on salient or familiar attributes in images but ignore contextualized nuances. Moreover, most VLMs rely on the co-occurrence between specific objects and gender attributes to infer the ignored features, ultimately resulting in gender bias. We propose GAMA, a task-agnostic generation framework to mitigate gender bias. GAMA consists of two stages: narrative generation and answer inference. During narrative generation, GAMA yields all-sided but gender-obfuscated narratives, which prevents premature concentration on localized image features, especially gender attributes. During answer inference, GAMA integrates the…
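The two stages described above can be sketched as a small pipeline. This is only an illustrative sketch, not the authors' implementation: the names `obfuscate_gender`, `two_stage_answer`, `stub_narrate`, and `stub_answer` are hypothetical, and the word-replacement step is a crude stand-in for whatever mechanism GAMA actually uses to produce gender-obfuscated narratives. The stub functions take the place of real vision-language models.

```python
import re
from typing import Callable

# Hypothetical word map, for illustration only; not taken from the paper.
NEUTRAL_TERMS = {
    "man": "person", "woman": "person",
    "he": "they", "she": "they",
    "his": "their", "her": "their",
}

# Match any listed term as a whole word, ignoring case.
_PATTERN = re.compile(r"\b(" + "|".join(NEUTRAL_TERMS) + r")\b", re.IGNORECASE)


def obfuscate_gender(text: str) -> str:
    """Replace gendered words with neutral ones (assumed stand-in step)."""
    return _PATTERN.sub(lambda m: NEUTRAL_TERMS[m.group(0).lower()], text)


def two_stage_answer(
    image: object,
    question: str,
    narrate: Callable[[object], str],
    answer: Callable[[object, str, str], str],
) -> str:
    """Stage 1: build a gender-obfuscated narrative of the image.
    Stage 2: answer the question from the image, narrative, and question."""
    narrative = obfuscate_gender(narrate(image))
    return answer(image, narrative, question)


# Stub models so the sketch runs without any real VLM.
def stub_narrate(image: object) -> str:
    return "A woman holds her umbrella on a rainy street."


def stub_answer(image: object, narrative: str, question: str) -> str:
    return f"narrative={narrative!r}; question={question!r}"


if __name__ == "__main__":
    print(two_stage_answer(None, "What is being held?", stub_narrate, stub_answer))
```

The point of the structure is that the answer stage never sees a narrative with explicit gender terms, so it has less opportunity to lean on object-gender co-occurrence when answering.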
Taxonomy
Topics: Language, Metaphor, and Cognition · Organizational Strategy and Culture · Education Practices and Challenges
