Evaluating Cultural Awareness of LLMs for Yoruba, Malayalam, and English
Fiifi Dawson, Zainab Mosunmola, Sahil Pocker, Raj Abhijit Dandekar, Rajat Dandekar, Sreedath Panat

TL;DR
This paper uses Hofstede's cultural dimensions to assess how well large language models understand the cultures associated with Malayalam and Yoruba. The models show high cultural similarity for English but miss regional cultural nuances.
Contribution
The paper introduces a framework for evaluating LLMs' cultural awareness across regional languages using Hofstede's dimensions, and it highlights the need for culturally enriched datasets.
Findings
LLMs show high cultural similarity with English.
LLMs fail to capture cultural nuances in Malayalam and Yoruba.
The results underscore the importance of regional-language training datasets.
Abstract
Although LLMs have been extremely effective in a large number of complex tasks, their understanding and functionality for regional languages and cultures are not well studied. In this paper, we explore the ability of various LLMs to comprehend the cultural aspects of two regional languages: Malayalam (state of Kerala, India) and Yoruba (West Africa). Using Hofstede's six cultural dimensions: Power Distance (PDI), Individualism (IDV), Motivation towards Achievement and Success (MAS), Uncertainty Avoidance (UAV), Long Term Orientation (LTO), and Indulgence (IVR), we quantify the cultural awareness of LLM-based responses. We demonstrate that although LLMs show a high cultural similarity for English, they fail to capture the cultural nuances across these 6 metrics for Malayalam and Yoruba. We also highlight the need for large-scale regional language LLM training with culturally enriched…
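The abstract does not state how LLM responses are compared against a culture's reference profile on the six dimensions. As a rough illustration only, the sketch below assumes each dimension has already been reduced to a single number on a common scale, and compares a model-derived profile with a reference profile using per-dimension absolute gaps and cosine similarity. The function names, the choice of cosine similarity, and the numbers in the usage note are assumptions for illustration, not the authors' method or results.

```python
from math import sqrt

# Dimension abbreviations as they appear in the abstract.
DIMENSIONS = ("PDI", "IDV", "MAS", "UAV", "LTO", "IVR")


def dimension_gaps(model_scores, reference_scores):
    """Return the absolute gap per dimension (hypothetical helper)."""
    return {d: abs(model_scores[d] - reference_scores[d]) for d in DIMENSIONS}


def cosine_similarity(model_scores, reference_scores):
    """Cosine similarity between two six-dimensional profiles.

    This metric is an assumption made for the sketch; the source does
    not say which similarity measure the paper uses.
    """
    a = [model_scores[d] for d in DIMENSIONS]
    b = [reference_scores[d] for d in DIMENSIONS]
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("profiles must not be all zeros")
    return dot / norm
```

A usage example with made-up placeholder numbers (not data from the paper): if `reference = {d: 50 for d in DIMENSIONS}` and `model = dict(reference, PDI=70)`, then `dimension_gaps(model, reference)["PDI"]` is `20` and every other gap is `0`. Reporting the per-dimension gaps alongside a single similarity figure shows which dimensions drive a mismatch.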
