Metaphors We Compute By: A Computational Audit of Cultural Translation vs. Thinking in LLMs
Yuan Chang, Jiaming Qu, Zhu Li

TL;DR
This paper examines whether large language models genuinely perform culture-aware reasoning or merely act as cultural translators, using a metaphor generation task to reveal limits in their cultural inclusivity.
Contribution
It presents a preliminary computational audit of cultural reasoning in LLMs, highlighting stereotyped and Western-default metaphor usage.
Findings
LLMs show stereotyped metaphor usage for certain cultural settings.
Prompting an LLM with a cultural identity does not, by itself, ensure culture-aware reasoning.
Models tend to default to Western conceptual frameworks.
Abstract
Large language models (LLMs) are often described as multilingual because they can understand and respond in many languages. However, speaking a language is not the same as reasoning within a culture. This distinction motivates a critical question: do LLMs truly conduct culture-aware reasoning? This paper presents a preliminary computational audit of cultural inclusivity in a creative writing task. We empirically examine whether LLMs act as culturally diverse creative partners or merely as cultural translators that leverage a dominant conceptual framework with localized expressions. Using a metaphor generation task spanning five cultural settings and several abstract concepts as a case study, we find that the model exhibits stereotyped metaphor usage for certain settings, as well as Western defaultism. These findings suggest that merely prompting an LLM with a cultural identity does not…
