See It from My Perspective: How Language Affects Cultural Bias in Image Understanding

Amith Ananthram; Elias Stengel-Eskin; Mohit Bansal; Kathleen McKeown

arXiv:2406.11665 · cs.CL · March 4, 2025 · 3 cites

TL;DR

This paper characterizes the Western bias of vision-language models (VLMs) in image understanding and shows that a lack of language diversity in model construction is one source of that bias.

Contribution

It characterizes Western cultural bias in VLMs, traces one source of it to a lack of language diversity in model construction, and shows that the bias shrinks when a language closer to the target culture is well represented during training.

Findings

01

VLMs perform better on the Western split than on the East Asian split of each task.

02

Language diversity in training reduces cultural bias in models.

03

Bias can be mitigated even when prompting in English, provided a language closer to the target culture was well represented during training.
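To make the first finding concrete, here is a minimal sketch of how a performance gap between the two cultural splits could be measured. It is an illustration only and not the authors' evaluation code: the function names `split_accuracy` and `western_bias`, the split labels `"western"` and `"east_asian"`, the choice of a simple accuracy difference as the bias measure, and the toy records are all assumptions made for this example.

```python
from collections import defaultdict


def split_accuracy(records):
    """Return accuracy per cultural split from (split, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for split, is_correct in records:
        totals[split] += 1
        correct[split] += int(is_correct)
    return {split: correct[split] / totals[split] for split in totals}


def western_bias(records, western="western", east_asian="east_asian"):
    """Hypothetical bias measure: Western accuracy minus East Asian accuracy.

    A positive value means the model did better on the Western split.
    The paper may define its measure differently; this is an assumption.
    """
    acc = split_accuracy(records)
    return acc[western] - acc[east_asian]


# Toy, made-up predictions (not results from the paper).
toy_records = [
    ("western", True), ("western", True), ("western", True), ("western", False),
    ("east_asian", True), ("east_asian", False),
    ("east_asian", True), ("east_asian", False),
]

accuracies = split_accuracy(toy_records)
gap = western_bias(toy_records)
print(accuracies, gap)
```

Running the same measure once per prompting language, or once per model variant, would allow the gaps to be compared side by side, which is the kind of comparison the second and third findings describe.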

Abstract

Vision-language models (VLMs) can respond to queries about images in many languages. However, beyond language, culture affects how we see things. For example, individuals from Western cultures focus more on the central figure in an image while individuals from East Asian cultures attend more to scene context. In this work, we characterize the Western bias of VLMs in image understanding and investigate the role that language plays in this disparity. We evaluate VLMs across subjective and objective visual tasks with culturally diverse images and annotations. We find that VLMs perform better on the Western split than on the East Asian split of each task. Through controlled experimentation, we trace one source of this bias in image understanding to the lack of diversity in language model construction. While inference in a language nearer to a culture can lead to reductions in bias, we show…

Code & Models

Repositories

amith-ananthram/see-it-from-my-perspective (PyTorch, official)

Videos

See It from My Perspective: How Language Affects Cultural Bias in Image Understanding · SlidesLive

Taxonomy

Topics: Language, Metaphor, and Cognition

Methods: Focus