Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective
Zhaotian Weng, Zijun Gao, Jerone Andrews, Jieyu Zhao

TL;DR
This paper introduces a causal mediation analysis framework to understand and mitigate bias in vision-language models, revealing that image features are the main bias source and that targeted interventions can significantly reduce bias.
Contribution
The study applies causal mediation analysis to identify bias pathways in VLMs, highlighting image features as primary bias contributors and proposing effective bias mitigation strategies.
Findings
Image features are the largest contributor to bias, accounting for 32.57% of it on MSCOCO.
Interventions targeting the image encoder reduce bias by over 22%.
Interventions cause minimal performance loss.
Abstract
Vision-language models (VLMs) pre-trained on extensive datasets can inadvertently learn biases by correlating gender information with specific objects or scenarios. Current methods, which focus on modifying inputs and monitoring changes in the model's output probability scores, often struggle to comprehensively understand bias from the perspective of model components. We propose a framework that incorporates causal mediation analysis to measure and map the pathways of bias generation and propagation within VLMs. This approach allows us to identify the direct effects of interventions on model bias and the indirect effects of interventions on bias mediated through different model components. Our results show that image features are the primary contributors to bias, with significantly higher impacts than text features, specifically accounting for 32.57% and 12.63% of the bias in the MSCOCO…
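The direct and indirect effects mentioned in the abstract can be illustrated with a toy example. The sketch below uses the standard textbook definitions of total, natural direct, and natural indirect effects; it is not the authors' implementation. The names `mediation_effects`, `toy_mediator`, and `toy_outcome` are hypothetical, and the linear functions merely stand in for a model component (such as an image encoder) and a downstream bias score.

```python
from typing import Callable, Dict


def mediation_effects(
    x_base: float,
    x_intervened: float,
    mediator: Callable[[float], float],
    outcome: Callable[[float, float], float],
) -> Dict[str, float]:
    """Decompose the effect of an input intervention on an outcome.

    mediator(x)   -> activation of the component under study (assumed stand-in)
    outcome(x, m) -> bias score given the input and the mediator activation
    """
    m_base = mediator(x_base)        # mediator under the original input
    m_int = mediator(x_intervened)   # mediator under the intervened input
    y_base = outcome(x_base, m_base)

    # Total effect: change the input and let the mediator follow it.
    total = outcome(x_intervened, m_int) - y_base
    # Direct effect: change the input, hold the mediator at its original value.
    direct = outcome(x_intervened, m_base) - y_base
    # Indirect effect: keep the input, set the mediator to its intervened value.
    indirect = outcome(x_base, m_int) - y_base
    return {"total": total, "direct": direct, "indirect": indirect}


# Hypothetical linear toy model, chosen only so the numbers are easy to check.
def toy_mediator(x: float) -> float:
    return 2.0 * x


def toy_outcome(x: float, m: float) -> float:
    return 0.5 * x + 1.5 * m


effects = mediation_effects(1.0, 0.0, toy_mediator, toy_outcome)
print(effects)
```

In this linear toy the direct and indirect effects sum to the total effect, and most of the change flows through the mediator. In a real VLM the components interact nonlinearly, so that additive decomposition should not be expected to hold exactly.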
Taxonomy
Topics: Categorization, perception, and language
Methods: Focus
