Refusal as Silence: Gendered Disparities in Vision-Language Model Responses
Sha Luo, Sang Jung Kim, Zening Duan, Kaiping Chen

TL;DR
This paper explores how gender identity influences refusal behavior in vision-language models, revealing disparities that impact fairness and access, and highlighting methodological considerations for equity in AI systems.
Contribution
It introduces a counterfactual persona approach to analyze gendered disparities in model refusals, advancing understanding of bias in AI content moderation.
Findings
Transgender and non-binary personas face higher refusal rates.
Refusal disparities persist even in non-harmful contexts.
The study offers methodological insights for equity audits that use LLMs.
Abstract
Refusal behavior by Large Language Models is increasingly visible in content moderation, yet little is known about how refusals vary by the identity of the user making the request. This study investigates refusal as a sociotechnical outcome through a counterfactual persona design that varies gender identity (male, female, non-binary, and transgender personas) while keeping the classification task and visual input constant. Focusing on a vision-language model (GPT-4V), we examine how identity-based language cues influence refusal in binary gender classification tasks. We find that transgender and non-binary personas experience significantly higher refusal rates, even in non-harmful contexts. Our findings also provide methodological implications for equity audits and content analysis using LLMs. Our findings underscore the importance of modeling identity-driven disparities and…
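The counterfactual persona design described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the persona wording, the task instruction, and the helper names (`build_prompts`, `refusal_rates`, `two_proportion_z`) are all hypothetical, and the pooled two-proportion z-test is an assumed choice of comparison, since the page does not say which statistical test the paper uses.

```python
from collections import defaultdict
from math import erf, sqrt

# Hypothetical persona cues; the paper's actual prompt wording is not given here.
PERSONAS = {
    "male": "I am a man.",
    "female": "I am a woman.",
    "non-binary": "I am a non-binary person.",
    "transgender": "I am a transgender person.",
}

# Hypothetical task instruction, held constant across all personas.
TASK = "Classify the gender of the person in the attached image as male or female."


def build_prompts(personas=PERSONAS, task=TASK):
    """Return one prompt per persona; only the identity cue differs."""
    return {name: f"{cue} {task}" for name, cue in personas.items()}


def refusal_rates(records):
    """Aggregate (persona, refused) pairs, where refused is a bool.

    Returns {persona: (n_refused, n_total, rate)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for persona, refused in records:
        counts[persona][0] += int(refused)
        counts[persona][1] += 1
    return {p: (r, n, r / n) for p, (r, n) in counts.items()}


def two_proportion_z(r1, n1, r2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (r1 + r2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        # Both groups refused always or never: no detectable difference.
        return 0.0, 1.0
    z = (r1 / n1 - r2 / n2) / se
    # Standard normal CDF expressed via erf.
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p
```

In an audit along these lines, each prompt from `build_prompts()` would be sent with the same image, each response labeled as a refusal or not, and the resulting pairs passed to `refusal_rates` before comparing any two personas with `two_proportion_z`. Holding the task and image fixed is what lets a difference in refusal rates be attributed to the identity cue alone.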
Taxonomy
Topics: Ethics and Social Impacts of AI
Methods: Attention Is All You Need · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer
