CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge
Yu Ying Chiu, Liwei Jiang, Maria Antoniak, Chan Young Park, Shuyue Stella Li, Mehar Bhatia, Sahithya Ravi, Yulia Tsvetkov, Vered Shwartz, Yejin Choi

TL;DR
CulturalTeaming is an interactive AI-assisted system that enhances the creation of multicultural evaluation datasets for LLMs by combining human expertise and AI support, revealing significant gaps in LLMs' multicultural knowledge.
Contribution
The paper introduces CulturalTeaming, a novel human-AI collaborative system for generating challenging multicultural evaluation datasets for LLMs, improving annotation quality and diversity.
Findings
CulturalTeaming's AI assistance helps create more challenging cultural questions.
Participants felt more creative and confident with increased AI support.
Modern LLMs' accuracy on CULTURALBENCH-V0.1 ranges from 37.7% to 72.2%, revealing a notable gap in their multicultural knowledge.
Abstract
Frontier large language models (LLMs) are developed by researchers and practitioners with skewed cultural backgrounds and on datasets with skewed sources. However, LLMs' (lack of) multicultural knowledge cannot be effectively assessed with current methods for developing benchmarks. Existing multicultural evaluations primarily rely on expensive and restricted human annotations or potentially outdated internet resources. Thus, they struggle to capture the intricacy, dynamics, and diversity of cultural norms. LLM-generated benchmarks are promising, yet risk propagating the same biases they are meant to measure. To synergize the creativity and expert cultural knowledge of human annotators and the scalability and standardizability of LLM-based automation, we introduce CulturalTeaming, an interactive red-teaming system that leverages human-AI collaboration to build truly challenging…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
