How Different Is Stereotypical Bias Across Languages?
Ibrahim Tolga Öztürk, Rostislav Nedelchev, Christian Heumann, Esteban Garces Arias, Marius Roger, Bernd Bischl, and Matthias Aßenmacher

TL;DR
This study investigates stereotypical biases in multilingual pre-trained language models across several languages, revealing nuanced differences and surprising anti-stereotypical behaviors, and emphasizes the importance of multilingual analysis.
Contribution
It extends bias analysis to multiple languages and architectures, providing new insights into cross-lingual stereotypes in pre-trained models.
Findings
English (monolingual) models exhibit the strongest bias.
Turkish models show the least stereotypical bias.
mGPT-2 (partly) displays anti-stereotypical behavior across languages.
Abstract
Recent studies have demonstrated how to assess the stereotypical bias in pre-trained English language models. In this work, we extend this branch of research in multiple different dimensions by systematically investigating (a) mono- and multilingual models of (b) different underlying architectures with respect to their bias in (c) multiple different languages. To that end, we make use of the English StereoSet data set (Nadeem et al., 2021), which we semi-automatically translate into German, French, Spanish, and Turkish. We find that it is of major importance to conduct this type of analysis in a multilingual setting, as our experiments show a much more nuanced picture as well as notable differences from the English-only analysis. The main takeaways from our analysis are that mGPT-2 (partly) shows surprising anti-stereotypical behavior across languages, English (monolingual) models…
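To make "stereotypical" and "anti-stereotypical" behavior concrete, here is a minimal sketch of a StereoSet-style stereotype score: the percentage of test items on which a model scores the stereotypical option above the anti-stereotypical one. A value near 50 suggests no systematic preference, values above 50 suggest stereotypical bias, and values below 50 suggest anti-stereotypical behavior. This is an illustration only, not the authors' evaluation code; the function name `stereotype_score`, the tie handling, and the toy numbers are assumptions made for this example.

```python
from typing import Iterable, Tuple


def stereotype_score(pairs: Iterable[Tuple[float, float]]) -> float:
    """Return the percentage of pairs where the stereotypical option wins.

    Each pair is (score_stereotypical, score_anti_stereotypical), for
    example model log-probabilities of the two candidate continuations.
    Ties are counted as "not preferred" (an assumption of this sketch).
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one scored pair")
    preferred = sum(1 for stereo, anti in pairs if stereo > anti)
    return 100.0 * preferred / len(pairs)


# Hypothetical log-probabilities, invented purely for illustration.
toy_scores = [(-1.2, -2.0), (-3.1, -2.4), (-0.9, -1.5), (-2.2, -2.0)]
score = stereotype_score(toy_scores)
print(f"stereotype score: {score:.1f}")
```

In this toy input the stereotypical option wins on two of four items, so the score sits at the neutral midpoint.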
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
