Investigating Cultural Alignment of Large Language Models
Badr AlKhamissi, Muhammad ElNokrashy, Mai AlKhamissi, Mona Diab

TL;DR
This study examines how well large language models reflect cultural diversity, showing that their cultural alignment improves with language-specific prompts and data, and introducing anthropological prompting to enhance this alignment.
Contribution
The paper introduces Anthropological Prompting, a novel method to improve cultural alignment in large language models, and highlights the importance of balanced multilingual pretraining datasets.
Findings
Models align better with a culture when prompted in that culture's dominant language.
Cultural misalignment increases for underrepresented personas.
Anthropological Prompting improves cultural understanding in models.
Abstract
The intricate relationship between language and culture has long been a subject of exploration within the realm of linguistic anthropology. Large Language Models (LLMs), promoted as repositories of collective human knowledge, raise a pivotal question: do these models genuinely encapsulate the diverse knowledge adopted by different cultures? Our study reveals that these models demonstrate greater cultural alignment along two dimensions -- firstly, when prompted with the dominant language of a specific culture, and secondly, when pretrained with a refined mixture of languages employed by that culture. We quantify cultural alignment by simulating sociological surveys, comparing model responses to those of actual survey participants as references. Specifically, we replicate a survey conducted in various regions of Egypt and the United States through prompting LLMs with different pretraining…
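The survey comparison described in the abstract can be illustrated with a minimal sketch. It assumes alignment is measured as a simple exact-match rate between a model's answers and one respondent's answers to the same multiple-choice questions; the excerpt above does not state the paper's actual metric, so this is only one plausible reading. The function name, question IDs, and answer options are hypothetical.

```python
from typing import Dict


def alignment_score(model_answers: Dict[str, str],
                    human_answers: Dict[str, str]) -> float:
    """Fraction of shared questions where the model chose the same
    option as the human respondent (assumed exact-match metric)."""
    shared = [q for q in human_answers if q in model_answers]
    if not shared:
        raise ValueError("no overlapping questions to compare")
    matches = sum(model_answers[q] == human_answers[q] for q in shared)
    return matches / len(shared)


# Hypothetical example: the model, prompted with a respondent's persona,
# answers four survey questions; the respondent's real answers are the reference.
model = {"Q1": "agree", "Q2": "disagree", "Q3": "neutral", "Q4": "agree"}
human = {"Q1": "agree", "Q2": "agree", "Q3": "neutral", "Q4": "agree"}

print(alignment_score(model, human))  # prints 0.75
```

Averaging this score over many respondents from one region, and repeating the run with prompts in different languages, would give the kind of per-culture comparison the abstract describes.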
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
