Bias Beyond Demographics: Probing Decision Boundaries in Black-Box LVLMs via Counterfactual VQA
Zaiying Zhao, Toshihiko Yamasaki

TL;DR
This paper introduces a counterfactual VQA benchmark for evaluating fairness in black-box LVLMs. It finds that non-demographic contextual factors influence model decisions more than demographic attributes, and it evaluates instruction-based and few-shot debiasing strategies.
Contribution
It presents a novel counterfactual VQA benchmark for probing decision boundaries and biases in LVLMs, expanding fairness evaluation beyond demographics.
Findings
Non-demographic attributes distort LVLM decisions more than demographic ones.
Instruction-based debiasing has limited effectiveness and may amplify biases.
Few-shot human norm examples improve model consistency and fairness.
Abstract
Recent advances in large vision-language models (LVLMs) have amplified concerns about fairness, yet existing evaluations remain confined to demographic attributes and often conflate fairness with refusal behavior. This paper broadens the scope of fairness by introducing a counterfactual VQA benchmark that probes the decision boundaries of closed-source LVLMs under controlled context shifts. Each image pair differs in a single visual attribute that has been validated as irrelevant to the question, enabling ground-truth-free and refusal-aware analysis of reasoning stability. Comprehensive experiments reveal that non-demographic attributes, such as environmental context or social behavior, distort LVLM decision-making more strongly than demographic ones. Moreover, instruction-based debiasing shows limited effectiveness and can even amplify these asymmetries, whereas exposure to a small…
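The paired, ground-truth-free evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' metric: the names `PairResult` and `summarize`, the use of `None` to mark a refusal, and the two reported rates are all assumptions made for illustration. The idea is only that, when the changed attribute is irrelevant to the question, any change in the answer (or in the decision to refuse) counts as instability, so no reference answer is needed.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class PairResult:
    """Model outputs for one counterfactual image pair (hypothetical schema)."""
    attribute: str                         # the single attribute that was changed
    answer_original: Optional[str]         # None marks a refusal (assumption)
    answer_counterfactual: Optional[str]   # None marks a refusal (assumption)


def summarize(results: Iterable[PairResult]) -> Dict[str, Dict[str, float]]:
    """Compute per-attribute flip rate and refusal-asymmetry rate.

    flip_rate: share of pairs answered on both sides whose answers differ.
    refusal_asymmetry_rate: share of all pairs where exactly one side refused.
    """
    counts: Dict[str, Dict[str, int]] = {}
    for r in results:
        c = counts.setdefault(
            r.attribute, {"pairs": 0, "answered": 0, "flips": 0, "asym": 0}
        )
        c["pairs"] += 1
        a, b = r.answer_original, r.answer_counterfactual
        if a is None and b is None:
            continue  # consistent refusal: neither a flip nor an asymmetry
        if a is None or b is None:
            c["asym"] += 1  # refusal on one side only
            continue
        c["answered"] += 1
        if a.strip().lower() != b.strip().lower():
            c["flips"] += 1  # answer changed although the attribute is irrelevant

    return {
        attr: {
            "flip_rate": c["flips"] / c["answered"] if c["answered"] else 0.0,
            "refusal_asymmetry_rate": c["asym"] / c["pairs"],
        }
        for attr, c in counts.items()
    }


if __name__ == "__main__":
    demo = [
        PairResult("background", "yes", "no"),
        PairResult("background", "yes", "yes"),
        PairResult("gender", "yes", None),
        PairResult("gender", "No", "no "),
    ]
    for attr, rates in summarize(demo).items():
        print(attr, rates)
```

Keeping refusals separate from answer flips is one way to avoid conflating fairness with refusal behavior: a model that refuses consistently on both images contributes to neither rate.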
Taxonomy
Topics: Multimodal Machine Learning Applications · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)
