Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals

Phillip Howard; Kathleen C. Fraser; Anahita Bhiwandiwalla; Svetlana Kiritchenko

arXiv:2405.20152·cs.CV·May 1, 2025

TL;DR

This study investigates social biases in large vision-language models (LVLMs) by analyzing over 57 million responses generated under counterfactual changes to input images, revealing that race, gender, and physical traits influence the generated content.

Contribution

The paper introduces a large-scale bias evaluation framework for LVLMs using counterfactual image modifications, highlighting biases across multiple social attributes.

Findings

01. Biases related to race, gender, and physical traits significantly affect generated content.

02. Counterfactual image changes reveal the influence of visual attributes on toxicity and stereotypes.

03. Large-scale analysis uncovers systematic biases in popular LVLMs.
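
The counterfactual comparison behind these findings can be illustrated with a minimal sketch. It assumes each response has already been given a numeric score (for example a toxicity score from some classifier) and is tagged with a counterfactual set and a social-attribute group. The function name `attribute_gaps`, the record layout, and the toy scores are hypothetical and are not taken from the paper; the gap metric (largest minus smallest group mean) is one plausible choice, not necessarily the one the authors use.

```python
from collections import defaultdict
from statistics import mean


def attribute_gaps(records):
    """Compute, per counterfactual set, the spread of mean scores across groups.

    records: iterable of (set_id, group, score) tuples, where images in the
    same set are assumed to differ only in the depicted social attribute.
    Returns a dict mapping set_id -> (max group mean - min group mean).
    """
    by_set = defaultdict(lambda: defaultdict(list))
    for set_id, group, score in records:
        by_set[set_id][group].append(score)

    gaps = {}
    for set_id, groups in by_set.items():
        group_means = [mean(scores) for scores in groups.values()]
        gaps[set_id] = max(group_means) - min(group_means)
    return gaps


# Toy, made-up scores purely for illustration (not results from the paper).
records = [
    ("set-1", "group_a", 0.10),
    ("set-1", "group_b", 0.40),
    ("set-2", "group_a", 0.20),
    ("set-2", "group_b", 0.20),
]
gaps = attribute_gaps(records)
```

Because the images within a set are assumed to be identical apart from the attribute, a large gap for a set suggests that the attribute alone shifted the model's output, while a gap near zero suggests it did not.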

Abstract

With the advent of Large Language Models (LLMs) possessing increasingly impressive capabilities, a number of Large Vision-Language Models (LVLMs) have been proposed to augment LLMs with visual inputs. Such models condition generated text on both an input image and a text prompt, enabling a variety of use cases such as visual question answering and multimodal chat. While prior studies have examined the social biases contained in text generated by LLMs, this topic has been relatively unexplored in LVLMs. Examining social biases in LVLMs is particularly challenging due to the confounding contributions of bias induced by information contained across the text and visual modalities. To address this challenging problem, we conduct a large-scale study of text generated by different LVLMs under counterfactual changes to input images, producing over 57 million responses from popular models. Our…

Taxonomy

Topics: Text Readability and Simplification

Methods: Sparse Evolutionary Training