BiasDora: Exploring Hidden Biased Associations in Vision-Language Models
Chahat Raj, Anjishnu Mukherjee, Aylin Caliskan, Antonios Anastasopoulos, Ziwei Zhu

TL;DR
BiasDora investigates hidden implicit biases in vision-language models across multiple bias dimensions, revealing subtle and extreme biases often overlooked by existing methods, and provides a dataset for further research.
Contribution
The paper introduces a systematic approach to uncover hidden implicit biases in VLMs across 9 bias dimensions, expanding beyond documented associations.
Findings
Identifies subtle and extreme biases in VLMs
Reveals variation in negativity, toxicity, and extremity of biases
Provides a publicly available dataset of bias associations
Abstract
Existing works examining Vision-Language Models (VLMs) for social biases predominantly focus on a limited set of documented bias associations, such as gender:profession or race:crime. This narrow scope often overlooks a vast range of unexamined implicit associations, restricting the identification and, hence, mitigation of such biases. We address this gap by probing VLMs to (1) uncover hidden, implicit associations across 9 bias dimensions. We systematically explore diverse input and output modalities and (2) demonstrate how biased associations vary in their negativity, toxicity, and extremity. Our work (3) identifies subtle and extreme biases that are typically not recognized by existing methodologies. We make the Dataset of retrieved associations (Dora) publicly available at https://github.com/chahatraj/BiasDora.
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling
Methods: Sparse Evolutionary Training · Focus
