Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with a Chinese Case Study
Calvin Yixiang Cheng, Scott A Hale

TL;DR
This paper evaluates methods for measuring moral foundations in Chinese texts. Multilingual models and large language models outperform translation-based and lexicon-based approaches, but human validation remains essential for capturing cultural nuance.
Contribution
The paper demonstrates that multilingual models and LLMs are effective for cross-language moral foundation measurement and highlights their advantages over translation-based and lexicon-based methods.
Findings
Multilingual models and LLMs perform reliably across languages.
Translation and lexicon methods often lose cultural information.
Human-in-the-loop validation is crucial for nuanced assessments.
Abstract
This study explores computational approaches for measuring moral foundations (MFs) in non-English corpora. Since most resources are developed primarily for English, cross-linguistic applications of moral foundation theory remain limited. Using Chinese as a case study, this paper evaluates the effectiveness of applying English resources to machine-translated text, local language lexicons, multilingual language models, and large language models (LLMs) in measuring MFs in non-English texts. The results indicate that machine translation and local lexicon approaches are insufficient for complex moral assessments, frequently resulting in a substantial loss of cultural information. In contrast, multilingual models and LLMs demonstrate reliable cross-language performance with transfer learning, with LLMs excelling in terms of data efficiency. Importantly, this study also underscores the need…
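To make the "local language lexicon" approach named in the abstract concrete, here is a minimal Python sketch of dictionary-based scoring. It is not the authors' implementation: the lexicon entries, the function name `score_foundations`, and the use of plain substring matching (rather than proper Chinese word segmentation) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy lexicon for illustration only; these entries are NOT taken
# from the paper or from any published moral foundations dictionary.
TOY_LEXICON = {
    "care": ["保护", "伤害", "关爱"],
    "fairness": ["公平", "正义", "欺骗"],
    "loyalty": ["忠诚", "背叛", "团结"],
    "authority": ["服从", "尊敬", "传统"],
    "sanctity": ["纯洁", "神圣", "肮脏"],
}


def score_foundations(text, lexicon=TOY_LEXICON):
    """Return the share of lexicon hits per foundation, using substring matching."""
    counts = Counter()
    for foundation, terms in lexicon.items():
        # str.count tallies non-overlapping occurrences of each term.
        counts[foundation] = sum(text.count(term) for term in terms)
    total = sum(counts.values())
    # Normalize to proportions; every score is 0.0 when no term matches.
    return {f: (c / total if total else 0.0) for f, c in counts.items()}


example = "我们应该保护弱者，公平对待每一个人。"
scores = score_foundations(example)
```

A counter like this sees only surface matches, so it cannot tell endorsement from negation or handle wording outside the dictionary. That limitation is an observation about this sketch, not a claim drawn from the paper's results.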
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining
