Investigating Gender Bias in Turkish Language Models
Orhun Mersin Caglidil, Malte Ostendorff, Georg Rehm

TL;DR
This paper examines gender and ethnic biases in Turkish language models, extending existing bias evaluation frameworks to a less-studied language and analyzing how model characteristics influence bias levels.
Contribution
It introduces new bias evaluation datasets for Turkish, including gender and ethnic bias tests, and analyzes factors affecting bias in Turkish language models.
Findings
Bias varies with model size and training data
Multilingual models exhibit different bias patterns
Turkish bias datasets are publicly available
Abstract
Language models are trained mostly on Web data, which often contains social stereotypes and biases that the models can inherit. This has potentially negative consequences, as models can amplify these biases in downstream tasks or applications. However, prior research has primarily focused on the English language, especially in the context of gender bias. In particular, grammatically gender-neutral languages such as Turkish are underexplored despite representing different linguistic properties to language models with possibly different effects on biases. In this paper, we fill this research gap and investigate the significance of gender bias in Turkish language models. We build upon existing bias evaluation frameworks and extend them to the Turkish language by translating existing English tests and creating new ones designed to measure gender bias in the context of Türkiye.…
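The abstract does not say which evaluation frameworks the authors extend, so the following is only an illustrative sketch of one common style of bias test, not the paper's method. It assumes each test item is a pair of sentences (one stereotypical, one anti-stereotypical) that a model has already scored, and it reports how often the stereotypical sentence gets the higher score. The function name `stereotype_preference_rate` and the example scores are hypothetical.

```python
from typing import Sequence, Tuple


def stereotype_preference_rate(pairs: Sequence[Tuple[float, float]]) -> float:
    """Return the fraction of pairs where the stereotypical sentence scores higher.

    Each pair is (stereotypical_score, anti_stereotypical_score), for example
    sentence log-likelihoods produced by a language model. A rate near 0.5
    suggests no systematic preference under this particular test.
    """
    if not pairs:
        raise ValueError("need at least one sentence pair")
    preferred = sum(1 for stereo, anti in pairs if stereo > anti)
    return preferred / len(pairs)


# Hypothetical scores for illustration only; these are not results from the paper.
example_pairs = [(-12.1, -13.4), (-9.8, -9.2), (-15.0, -15.6), (-11.3, -11.9)]
rate = stereotype_preference_rate(example_pairs)
print(f"stereotype preference rate: {rate:.2f}")
```

In practice the scores would come from the model under evaluation, and the sentence pairs from a translated or newly written Turkish test set such as the ones the paper describes.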
Taxonomy
Topics: Gender Studies in Language
