Cross-Cultural Value Awareness in Large Vision-Language Models
Phillip Howard, Xin Su, Kathleen C. Fraser

TL;DR
This paper investigates how cultural contexts depicted in images influence the judgments large vision-language models (LVLMs) make about a person's moral, ethical, and political values, using counterfactual image sets that depict the same person across different cultural contexts.
Contribution
It introduces a novel analysis of cultural stereotypes in LVLMs, focusing on their sensitivity to cultural contexts in moral and political value judgments.
Findings
LVLMs show varying awareness of cultural differences in value judgments.
Counterfactual images reveal biases and stereotypes in LVLM responses.
The evaluation framework effectively diagnoses cultural value sensitivity in LVLMs.
Abstract
The rapid adoption of large vision-language models (LVLMs) in recent years has been accompanied by growing fairness concerns due to their propensity to reinforce harmful societal stereotypes. While significant attention has been paid to such fairness concerns in the context of social biases, relatively little prior work has examined the presence of stereotypes in LVLMs related to cultural contexts such as religion, nationality, and socioeconomic status. In this work, we aim to narrow this gap by investigating how cultural contexts depicted in images influence the judgments LVLMs make about a person's moral, ethical, and political values. We conduct a multi-dimensional analysis of such value judgments in five popular LVLMs using counterfactual image sets, which depict the same person across different cultural contexts. Our evaluation framework diagnoses LVLM awareness of cultural value…
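The counterfactual idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation framework: it assumes each image in a counterfactual set has already been reduced to a single numeric value-judgment score, and the function name `context_sensitivity`, the context labels, and the scores are hypothetical placeholders, not results from the paper.

```python
from statistics import mean


def context_sensitivity(scores_by_context):
    """Summarize how much a model's judgment shifts across contexts.

    scores_by_context maps a cultural-context label to a numeric score the
    model assigned to the SAME person shown in that context (hypothetical
    scoring scheme, assumed for illustration only).
    Returns (average score, max-min spread across contexts).
    """
    values = list(scores_by_context.values())
    return mean(values), max(values) - min(values)


# Placeholder numbers for one counterfactual set; not data from the paper.
example_set = {"context_a": 3.0, "context_b": 4.0, "context_c": 2.0}

avg, spread = context_sensitivity(example_set)
print(f"average={avg}, spread={spread}")
```

Because the person is held fixed within a set, a spread of zero would mean the depicted context did not change the judgment, while a larger spread would suggest the context alone shifted it.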
