Understanding Intrinsic Socioeconomic Biases in Large Language Models

Mina Arzaghi; Florian Carichon; Golnoosh Farnadi

arXiv:2405.18662 · cs.CL · May 30, 2024

TL;DR

This paper investigates socioeconomic biases in large language models. It finds pervasive biases in models such as GPT-2, Llama 2, and Falcon, shows that these biases are amplified when demographic attributes intersect, and emphasizes the need for bias mitigation.

Contribution

Introduces a dataset of one million English sentences to systematically quantify socioeconomic biases in LLMs across demographic groups, and analyzes how intersectionality amplifies these biases.

Findings

01. Socioeconomic biases are pervasive across multiple LLMs, both established and state-of-the-art.

02. Intersectionality significantly amplifies these biases.

03. Models can extract multiple demographic attributes from names and associate them with socioeconomic biases.

Abstract

Large Language Models (LLMs) are increasingly integrated into critical decision-making processes, such as loan approvals and visa applications, where inherent biases can lead to discriminatory outcomes. In this paper, we examine the nuanced relationship between demographic attributes and socioeconomic biases in LLMs, a crucial yet understudied area of fairness in LLMs. We introduce a novel dataset of one million English sentences to systematically quantify socioeconomic biases across various demographic groups. Our findings reveal pervasive socioeconomic biases in both established models such as GPT-2 and state-of-the-art models like Llama 2 and Falcon. We demonstrate that these biases are significantly amplified when considering intersectionality, with LLMs exhibiting a remarkable capacity to extract multiple demographic attributes from names and then correlate them with specific…

Taxonomy

Topics: Computational and Text Analysis Methods

Methods: Cosine Annealing · Layer Normalization · Weight Decay · Attention Dropout · Linear Layer · Linear Warmup With Cosine Annealing · Byte Pair Encoding · Adam · Attention Is All You Need