Investigating Cultural Alignment of Large Language Models

Badr AlKhamissi; Muhammad ElNokrashy; Mai AlKhamissi; Mona Diab

arXiv:2402.13231 · cs.CL · July 9, 2024 · 1 citation

TL;DR

This study examines how well large language models reflect cultural diversity. It shows that cultural alignment improves when a model is prompted in the dominant language of a culture and when it is pretrained on a mixture of the languages that culture uses, and it introduces Anthropological Prompting to enhance this alignment.

Contribution

The paper introduces Anthropological Prompting, a novel method to improve cultural alignment in large language models, and highlights the importance of balanced multilingual pretraining datasets.

Findings

01. Models align better with a culture when prompted in that culture's dominant language.

02. Cultural misalignment increases for underrepresented personas.

03. Anthropological Prompting improves the cultural alignment of models.

Abstract

The intricate relationship between language and culture has long been a subject of exploration within the realm of linguistic anthropology. Large Language Models (LLMs), promoted as repositories of collective human knowledge, raise a pivotal question: do these models genuinely encapsulate the diverse knowledge adopted by different cultures? Our study reveals that these models demonstrate greater cultural alignment along two dimensions -- firstly, when prompted with the dominant language of a specific culture, and secondly, when pretrained with a refined mixture of languages employed by that culture. We quantify cultural alignment by simulating sociological surveys, comparing model responses to those of actual survey participants as references. Specifically, we replicate a survey conducted in various regions of Egypt and the United States through prompting LLMs with different pretraining…
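
The comparison described in the abstract (model responses checked against those of actual survey participants) can be illustrated with a small sketch. This is not the paper's metric: the abstract shown here does not define one, so the exact-match score below is an assumption chosen for simplicity. The names `Response` and `alignment_score` are hypothetical, as is the encoding of answers as integer option indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """One answer to one survey question (hypothetical record layout)."""
    persona_id: str   # identifies the participant or the simulated persona
    question_id: str  # identifies the survey question
    answer: int       # index of the chosen answer option


def alignment_score(model_responses, survey_responses):
    """Fraction of (persona, question) pairs where the model's answer
    equals the human reference answer.

    Assumption: exact match on the chosen option. Model responses with
    no matching human reference are skipped. Returns 0.0 when there is
    nothing to compare.
    """
    # Index the human answers by (persona, question) for direct lookup.
    reference = {
        (r.persona_id, r.question_id): r.answer for r in survey_responses
    }

    matches = 0
    total = 0
    for r in model_responses:
        key = (r.persona_id, r.question_id)
        if key not in reference:
            continue  # no human reference for this pair
        total += 1
        if r.answer == reference[key]:
            matches += 1

    return matches / total if total else 0.0
```

A score computed this way could be compared across prompt languages or across models to see which setting tracks the reference participants more closely. A softer measure that accounts for ordinal answer scales would be a natural alternative.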

Code & Models

Repositories

bkhmsi/cultural-trends (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods