Building Knowledge-Guided Lexica to Model Cultural Variation

Shreya Havaldar; Salvatore Giorgi; Sunny Rai; Young-Min Cho; Thomas Talhelm; Sharath Chandra Guntuku; Lyle Ungar

arXiv:2406.11622 · cs.CL · October 15, 2024 · 1 citation

TL;DR

This paper introduces a scalable method for measuring regional cultural variation through knowledge-guided lexica, addressing previous limitations in data and scalability, and highlights the shortcomings of modern LLMs in capturing cultural diversity.

Contribution

It proposes a novel approach to model cultural variation using knowledge-guided lexica and discusses the limitations of current LLMs in this domain.

Findings

01. Knowledge-guided lexica effectively model regional cultural differences.
02. Modern LLMs fail to capture and generate culturally diverse language.
03. The approach offers a scalable solution for cultural analysis in NLP.

Abstract

Cultural variation exists between nations (e.g., the United States vs. China), but also within regions (e.g., California vs. Texas, Los Angeles vs. San Francisco). Measuring this regional cultural variation can illuminate how and why people think and behave differently. Historically, it has been difficult to computationally model cultural variation due to a lack of training data and scalability constraints. In this work, we introduce a new research problem for the NLP community: How do we measure variation in cultural constructs across regions using language? We then provide a scalable solution: building knowledge-guided lexica to model cultural variation, encouraging future work at the intersection of NLP and cultural understanding. We also highlight modern LLMs' failure to measure cultural variation or generate culturally varied language.
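
To make the idea of measuring a cultural construct across regions with a lexicon more concrete, here is a minimal sketch of generic lexicon-based scoring. It is not the authors' method: the abstract does not describe how the knowledge-guided lexica are built or applied. The construct, the words and weights in `LEXICON`, the region names, and the helper functions `score_text` and `score_regions` are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon for a single construct. The words and weights
# are invented for illustration and are NOT taken from the paper.
LEXICON = {"i": 1.0, "me": 1.0, "my": 1.0, "we": -1.0, "our": -1.0, "us": -1.0}


def score_text(text, lexicon):
    """Return the average lexicon weight per token (0.0 for empty text)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(lexicon.get(tok, 0.0) for tok in tokens) / len(tokens)


def score_regions(posts, lexicon):
    """Average the per-text scores within each region.

    `posts` is an iterable of (region, text) pairs.
    """
    by_region = defaultdict(list)
    for region, text in posts:
        by_region[region].append(score_text(text, lexicon))
    return {region: sum(vals) / len(vals) for region, vals in by_region.items()}


# Made-up example data; region names are placeholders.
posts = [
    ("region_a", "I did it my way"),
    ("region_b", "We share our food with us all"),
]
scores = score_regions(posts, LEXICON)
```

With this toy data, `scores["region_a"]` is higher than `scores["region_b"]`, which is the kind of regional contrast a lexicon-based measure is meant to surface. A real lexicon would need far broader vocabulary coverage and validation against external measures of the construct.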

Code & Models

Repositories

shreyahavaldar/knowledge_guided_lexica (Official)

Videos

Building Knowledge-Guided Lexica to Model Cultural Variation

Taxonomy

Topics: Natural Language Processing Techniques