How Well Do LLMs Identify Cultural Unity in Diversity?

Jialin Li; Junli Wang; Junjie Hu; Ming Jiang

arXiv:2408.05102·cs.CL·August 12, 2024

TL;DR

This paper introduces a new benchmark dataset, CUNIT, to evaluate how well large language models understand the shared cultural concepts across different countries, revealing current limitations in their cultural awareness.

Contribution

The study presents CUNIT, a comprehensive dataset for assessing LLMs' understanding of cultural unity, and systematically evaluates LLMs' ability to identify cross-cultural concept associations.

Findings

01. LLMs show limited ability to capture cross-cultural concept associations.

02. Cultural associations vary significantly across different concept categories.

03. Geo-cultural proximity has minimal impact on LLMs' performance in understanding cultural similarities.

Abstract

Much work on the cultural awareness of large language models (LLMs) focuses on the models' sensitivity to geo-cultural diversity. However, in addition to cross-cultural differences, there also exists common ground across cultures. For instance, a bridal veil in the United States plays a similar cultural-relevant role as a honggaitou in China. In this study, we introduce a benchmark dataset CUNIT for evaluating decoder-only LLMs in understanding the cultural unity of concepts. Specifically, CUNIT consists of 1,425 evaluation examples building upon 285 traditional cultural-specific concepts across 10 countries. Based on a systematic manual annotation of cultural-relevant features per concept, we calculate the cultural association between any pair of cross-cultural concepts. Built upon this dataset, we design a contrastive matching task to evaluate the LLMs' capability to identify highly…
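The abstract describes scoring the cultural association between two concepts from their annotated features, then asking a model to pick the most associated candidate. The excerpt above does not state the scoring formula, so the following is only a minimal sketch under stated assumptions: it uses Jaccard overlap of feature sets as a stand-in metric, and the feature labels, the `association` and `pick_most_associated` helpers, and the extra "qipao" candidate are all hypothetical illustrations rather than contents of CUNIT.

```python
from typing import Dict, FrozenSet, List

# Hypothetical feature annotations. These labels are illustrative only
# and are not taken from the CUNIT dataset.
FEATURES: Dict[str, FrozenSet[str]] = {
    "bridal veil (US)": frozenset(
        {"wedding", "worn by bride", "covers face", "clothing"}
    ),
    "honggaitou (China)": frozenset(
        {"wedding", "worn by bride", "covers face", "clothing", "red"}
    ),
    "qipao (China)": frozenset({"clothing", "formal wear"}),
}


def association(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard overlap of two feature sets (an assumed stand-in metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pick_most_associated(
    query: str, candidates: List[str], features: Dict[str, FrozenSet[str]]
) -> str:
    """Return the candidate whose features overlap most with the query's."""
    return max(
        candidates, key=lambda c: association(features[query], features[c])
    )


if __name__ == "__main__":
    query = "bridal veil (US)"
    candidates = ["honggaitou (China)", "qipao (China)"]
    for c in candidates:
        print(c, round(association(FEATURES[query], FEATURES[c]), 2))
    print("best match:", pick_most_associated(query, candidates, FEATURES))
```

In a contrastive matching setup of this kind, the gold answer would come from the annotation-based score, and an LLM's choice among the candidates would be compared against it.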

Code & Models

Repositories

ljl0222/CUNIT (official)
