Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with a Chinese Case Study

Calvin Yixiang Cheng; Scott A Hale

arXiv:2502.02451·cs.CL·July 23, 2025

TL;DR

This paper evaluates methods for measuring moral foundations in Chinese texts. It finds that multilingual models and large language models outperform machine-translation and lexicon approaches, but that human validation remains essential for capturing cultural nuance.

Contribution

The paper demonstrates that multilingual models and LLMs can measure moral foundations across languages, and it highlights their advantages over machine-translation and local-lexicon methods.

Findings

01 Multilingual models and LLMs perform reliably across languages.

02 Translation and lexicon methods often lose cultural information.

03 Human-in-the-loop validation is crucial for nuanced assessments.

Abstract

This study explores computational approaches for measuring moral foundations (MFs) in non-English corpora. Since most resources are developed primarily for English, cross-linguistic applications of moral foundation theory remain limited. Using Chinese as a case study, this paper evaluates four approaches to measuring MFs in non-English texts: applying English resources to machine-translated text, local-language lexicons, multilingual language models, and large language models (LLMs). The results indicate that the machine translation and local lexicon approaches are insufficient for complex moral assessments and frequently result in a substantial loss of cultural information. In contrast, multilingual models and LLMs demonstrate reliable cross-language performance with transfer learning, with LLMs excelling in terms of data efficiency. Importantly, this study also underscores the need…
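To make the local-lexicon approach named in the abstract concrete, the following is a minimal sketch of dictionary-based scoring. It is not the authors' implementation: the three foundations shown, the word lists in `TOY_LEXICON`, the function `lexicon_scores`, and the example sentence are all hypothetical and chosen only for illustration. Substring matching is assumed here because Chinese text is not whitespace-delimited.

```python
from collections import Counter

# Hypothetical toy lexicon for illustration only. These word lists are NOT
# taken from the paper or from any published moral foundations dictionary.
TOY_LEXICON = {
    "care": ["保护", "关爱", "伤害"],
    "fairness": ["公平", "正义", "欺骗"],
    "loyalty": ["忠诚", "背叛", "团结"],
}


def lexicon_scores(text, lexicon):
    """Return each foundation's share of all lexicon hits found in `text`."""
    hits = Counter()
    for foundation, terms in lexicon.items():
        # Count non-overlapping substring occurrences of every term.
        hits[foundation] = sum(text.count(term) for term in terms)
    total = sum(hits.values())
    if total == 0:
        # No lexicon term matched, so every foundation scores zero.
        return {foundation: 0.0 for foundation in lexicon}
    return {foundation: hits[foundation] / total for foundation in lexicon}


# "We should protect the weak and uphold fairness and justice."
example = "我们应该保护弱者，维护公平与正义。"
scores = lexicon_scores(example, TOY_LEXICON)
print(scores)
```

A scorer like this only sees surface matches. It cannot tell whether a term is negated, quoted, or used ironically, which is one plausible reason a lexicon method could miss contextual and cultural information of the kind the abstract describes.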

Code & Models

Repositories

calvinchengyx/cross-lan-mft-measure
PyTorch · Official

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining