Uncovering Name-Based Biases in Large Language Models Through Simulated Trust Game

Yumou Wei; Paulo F. Carvalho; John Stamper

arXiv:2404.14682·cs.CY·April 24, 2024

TL;DR

This paper investigates whether large language models exhibit name-based biases similar to humans by using a simulated Trust Game with carefully curated names, revealing biases in both base and instruction-tuned models.

Contribution

It introduces a novel method to detect name-based biases in language models through a social interaction simulation, extending bias analysis beyond word representations.

Findings

1. Models show biases consistent with societal stereotypes.

2. Bias detection is effective in both base and instruction-tuned models.

3. The approach validates the presence of subtle social biases in language models.

Abstract

Gender and race inferred from an individual's name are a notable source of stereotypes and biases that subtly influence social interactions. Abundant evidence from human experiments has revealed the preferential treatment that one receives when one's name suggests a predominant gender or race. As large language models acquire more capabilities and begin to support everyday applications, it becomes crucial to examine whether they manifest similar biases when encountering names in a complex social interaction. In contrast to previous work that studies name-based biases in language models at a more fundamental level, such as word representations, we challenge three prominent models to predict the outcome of a modified Trust Game, a well-publicized paradigm for studying trust and reciprocity. To ensure the internal validity of our experiments, we have carefully curated a list of racially…
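For readers unfamiliar with the paradigm, the sketch below shows the payoff arithmetic of the classic Trust Game and one way a name-conditioned prompt could be assembled. It is a minimal illustration only: the abstract does not specify how the game was modified, so the tripling multiplier is the textbook convention rather than the paper's setting, and `trust_game_outcome`, `build_prompt`, and the prompt wording are hypothetical names and text, not the authors' materials.

```python
from dataclasses import dataclass

# Assumption: the classic Trust Game triples the amount sent.
# The paper's modified game may use different rules.
MULTIPLIER = 3


@dataclass(frozen=True)
class Outcome:
    investor_payoff: float
    trustee_payoff: float


def trust_game_outcome(endowment: float, sent: float, returned: float) -> Outcome:
    """Compute both players' payoffs for one round of the classic game."""
    if not 0 <= sent <= endowment:
        raise ValueError("sent must be between 0 and the endowment")
    received = MULTIPLIER * sent  # the trustee receives the multiplied amount
    if not 0 <= returned <= received:
        raise ValueError("returned must be between 0 and the amount received")
    return Outcome(
        investor_payoff=endowment - sent + returned,
        trustee_payoff=received - returned,
    )


def build_prompt(investor_name: str, trustee_name: str, endowment: int) -> str:
    """Hypothetical prompt template; NOT the wording used in the paper."""
    return (
        f"{investor_name} has ${endowment} and can send any part of it to "
        f"{trustee_name}. The amount sent is tripled, and {trustee_name} then "
        f"decides how much to return. {investor_name} decides to send $"
    )


if __name__ == "__main__":
    # Placeholder names; a real study would substitute curated names.
    print(build_prompt("PERSON_A", "PERSON_B", 10))
    print(trust_game_outcome(endowment=10, sent=4, returned=6))
```

Under these assumptions, a bias probe would vary only the names in the prompt, hold everything else fixed, and compare the amounts the model predicts across name groups.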

Taxonomy

Topics: Topic Modeling