Speaking Multiple Languages Affects the Moral Bias of Language Models

Katharina Hämmerl; Björn Deiseroth; Patrick Schramowski; Jindřich Libovický; Constantin A. Rothkopf; Alexander Fraser; Kristian Kersting

arXiv:2211.07733·cs.CL·June 2, 2023

TL;DR

This paper investigates how pre-trained multilingual language models encode moral norms across different languages, revealing that they reflect language-specific moral biases that do not always align with human cultural differences.

Contribution

The study applies the MoralDirection framework to multilingual models, analyzing moral biases across languages and comparing model responses with human moral judgments.
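As background for readers unfamiliar with MoralDirection: to the best of our understanding of the original framework (Schramowski et al.), it embeds a set of phrases about actions with a sentence encoder, runs PCA on those embeddings, and treats the projection onto the top principal component as a moral score. The sketch below illustrates only that projection step. It is not the authors' code: the function names `fit_moral_direction` and `moral_score` are hypothetical, and random toy vectors stand in for real sentence embeddings from a multilingual model.

```python
import numpy as np


def fit_moral_direction(embeddings):
    """Return (mean, unit vector) for the top principal component of the rows."""
    X = np.asarray(embeddings, dtype=float)
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[0]


def moral_score(embedding, mean, direction):
    """Project one centered embedding onto the fitted direction."""
    centered = np.asarray(embedding, dtype=float) - mean
    return float(np.dot(centered, direction))


# Toy stand-ins for sentence embeddings (ASSUMPTION: real usage would encode
# phrases about positive and negative actions with a sentence encoder).
rng = np.random.default_rng(0)
axis = np.zeros(8)
axis[0] = 1.0
positive = 2.0 * axis + 0.1 * rng.normal(size=(5, 8))
negative = -2.0 * axis + 0.1 * rng.normal(size=(5, 8))

mean, direction = fit_moral_direction(np.vstack([positive, negative]))

# The sign of a principal component is arbitrary, so orient it such that a
# known positive anchor receives a positive score.
if moral_score(positive[0], mean, direction) < 0:
    direction = -direction

scores_pos = [moral_score(e, mean, direction) for e in positive]
scores_neg = [moral_score(e, mean, direction) for e in negative]
```

In a multilingual setting, one would presumably fit such a direction per language (or on one language and score another) and then compare the resulting scores across languages. That comparison is the subject of the paper and is not reproduced here.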

Findings

01. Models encode language-specific moral biases.

02. Moral biases in models do not always match human cultural differences.

03. Multilingual models exhibit varying moral norms across languages.

Abstract

Pre-trained multilingual language models (PMLMs) are commonly used when dealing with data from multiple languages and cross-lingual transfer. However, PMLMs are trained on varying amounts of data for each language. In practice this means their performance is often much better on English than many other languages. We explore to what extent this also applies to moral norms. Do the models capture moral norms from English and impose them on other languages? Do the models exhibit random and thus potentially harmful beliefs in certain languages? Both these issues could negatively impact cross-lingual transfer and potentially lead to harmful outcomes. In this paper, we (1) apply the MoralDirection framework to multilingual models, comparing results in German, Czech, Arabic, Chinese, and English, (2) analyse model behaviour on filtered parallel subtitles corpora, and (3) apply the models to a…

Code & Models

Repositories

kathyhaem/multiling-moral-bias (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Interpreting and Communication in Healthcare