Speaking Multiple Languages Affects the Moral Bias of Language Models
Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Constantin A. Rothkopf, Alexander Fraser, Kristian Kersting

TL;DR
This paper investigates how pre-trained multilingual language models encode moral norms across different languages, revealing that they reflect language-specific moral biases that do not always align with human cultural differences.
Contribution
The study applies the MoralDirection framework to multilingual models, analyzing moral biases across languages and comparing model responses with human moral judgments.
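As a rough illustration of the kind of computation a MoralDirection-style analysis involves, the sketch below fits a single "moral direction" as the top principal component of sentence embeddings for positively and negatively connoted seed phrases, then scores a phrase by projecting it onto that direction. This is a simplified sketch under stated assumptions, not the authors' implementation: the embeddings are synthetic toy vectors rather than outputs of a multilingual sentence encoder, and the names `fit_moral_direction` and `moral_score` are hypothetical.

```python
import numpy as np

def fit_moral_direction(embeddings, positive_mask):
    """Return (mean, unit direction) from the top principal component.

    embeddings: (n, d) array of sentence embeddings for seed phrases.
    positive_mask: boolean array marking the positively connoted phrases.
    """
    mean = embeddings.mean(axis=0)
    centered = embeddings - mean
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    # The sign of a principal axis is arbitrary; flip it so that
    # positive seed phrases receive higher scores than negative ones.
    scores = centered @ direction
    if scores[positive_mask].mean() < scores[~positive_mask].mean():
        direction = -direction
    return mean, direction

def moral_score(embedding, mean, direction):
    """Project one embedding onto the fitted direction."""
    return float((embedding - mean) @ direction)

# Toy stand-in for encoder outputs (assumption: real use would embed
# templated phrases in each language with a sentence encoder).
rng = np.random.default_rng(0)
axis = np.zeros(8)
axis[0] = 1.0
pos = 2.0 * axis + 0.1 * rng.standard_normal((5, 8))
neg = -2.0 * axis + 0.1 * rng.standard_normal((5, 8))
embeddings = np.vstack([pos, neg])
positive_mask = np.array([True] * 5 + [False] * 5)

mean, direction = fit_moral_direction(embeddings, positive_mask)
```

Repeating the fit separately for each language's seed phrases and comparing the resulting scores for translated phrases is one way such a setup could surface language-specific differences.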
Findings
Models encode language-specific moral biases.
Moral biases in models do not always match human cultural differences.
Multilingual models exhibit varying moral norms across languages.
Abstract
Pre-trained multilingual language models (PMLMs) are commonly used when dealing with data from multiple languages and cross-lingual transfer. However, PMLMs are trained on varying amounts of data for each language. In practice this means their performance is often much better on English than many other languages. We explore to what extent this also applies to moral norms. Do the models capture moral norms from English and impose them on other languages? Do the models exhibit random and thus potentially harmful beliefs in certain languages? Both these issues could negatively impact cross-lingual transfer and potentially lead to harmful outcomes. In this paper, we (1) apply the MoralDirection framework to multilingual models, comparing results in German, Czech, Arabic, Chinese, and English, (2) analyse model behaviour on filtered parallel subtitles corpora, and (3) apply the models to a…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Interpreting and Communication in Healthcare
