GABInsight: Exploring Gender-Activity Binding Bias in Vision-Language Models
Ali Abdollahi, Mahdi Ghaznavi, Mohammad Reza Karimi Nejad, Arash Mari Oriyad, Reza Abbasi, Ali Salesi, Melika Behjati, Mohammad Hossein Rohban, and Mahdieh Soleymani Baghshah

TL;DR
This paper investigates gender-activity bias in vision-language models, introduces a new dataset to evaluate this bias, and finds that such bias significantly impacts model performance in complex scenarios.
Contribution
The study introduces the GAB dataset for assessing gender-activity binding bias and analyzes how this bias affects the performance of multiple pre-trained VLMs.
Findings
VLMs show a 13.2% performance decline due to GAB bias.
The GAB dataset contains approximately 5500 AI-generated images.
Bias impacts both image and text retrieval tasks.
Abstract
Vision-language models (VLMs) are intensively used in many downstream tasks, including those requiring assessments of individuals appearing in the images. While VLMs perform well in simple single-person scenarios, in real-world applications, we often face complex situations in which there are persons of different genders doing different activities. We show that in such cases, VLMs are biased towards identifying the individual with the expected gender (according to ingrained gender stereotypes in the model or other forms of sample selection bias) as the performer of the activity. We refer to this bias in associating an activity with the gender of its actual performer in an image or text as the Gender-Activity Binding (GAB) bias and analyze how this bias is internalized in VLMs. To assess this bias, we have introduced the GAB dataset with approximately 5500 AI-generated images that…
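As a rough illustration of how a bias of this kind could be quantified, the sketch below compares a model's accuracy on scenes where the performer has the stereotypically expected gender against scenes where they do not. It is a minimal sketch under stated assumptions, not the paper's evaluation protocol: the `Trial` record, its fields, the two-caption setup, and the toy scores are all hypothetical, and in practice the scores would come from a VLM's image-text similarity.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One image scored against two captions (all fields are hypothetical)."""
    stereotypical: bool   # True if the actual performer has the "expected" gender
    score_correct: float  # similarity to the caption naming the actual performer
    score_swapped: float  # similarity to the caption with the genders swapped


def accuracy(trials):
    """Fraction of trials where the correct caption gets the higher score."""
    if not trials:
        raise ValueError("no trials")
    hits = sum(t.score_correct > t.score_swapped for t in trials)
    return hits / len(trials)


def binding_gap(trials):
    """Accuracy on stereotypical trials minus accuracy on the remaining ones."""
    expected = [t for t in trials if t.stereotypical]
    unexpected = [t for t in trials if not t.stereotypical]
    return accuracy(expected) - accuracy(unexpected)


# Toy scores, invented for illustration only (not results from the paper).
trials = [
    Trial(True, 0.31, 0.25),
    Trial(True, 0.29, 0.27),
    Trial(False, 0.26, 0.28),
    Trial(False, 0.30, 0.24),
]
gap = binding_gap(trials)
```

A positive `gap` would suggest the model binds the activity to the expected gender more reliably than to the actual performer; a gap near zero would suggest no such preference on the trials tested.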
Taxonomy
Topics: Language, Metaphor, and Cognition · Educational Research and Analysis
Methods: Fast Attention Via Positive Orthogonal Random Features · Performer
