Measuring Harmful Representations in Scandinavian Language Models

Samia Touileb; Debora Nozza

arXiv:2211.11678 · cs.CL · November 22, 2022 · 1 citation

TL;DR

This study investigates gender-based harmful and toxic content in Scandinavian pre-trained language models. It finds that the models contain gender-based stereotypes despite the region's reputation for gender equality, which points to risks in real-world applications.

Contribution

The paper probes nine pre-trained language models covering Danish, Swedish, and Norwegian with manually created template-based sentences, then scores the completions with two measures of harmful and toxic content.

Findings

01 The models contain harmful gender-based stereotypes, with similar values across Danish, Swedish, and Norwegian.

02 Harmful content is present despite the region's reputation for gender equality.

03 The results point to risks in deploying these models in real-world settings.

Abstract

Scandinavian countries are perceived as role models when it comes to gender equality. With the advent of pre-trained language models and their widespread usage, we investigate to what extent gender-based harmful and toxic content exists in selected Scandinavian language models. We examine nine models, covering Danish, Swedish, and Norwegian, by manually creating template-based sentences and probing the models for completions. We evaluate the completions using two methods for measuring harmful and toxic content and provide a thorough analysis of the results. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes, with similar values across all languages. This finding goes against the general expectations related to gender equality in Scandinavian countries and shows the potentially problematic outcomes of using such models in real-world settings.
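
The probing setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the templates, subject terms, lexicon entries, and the `toy_complete` stand-in for a masked language model are all hypothetical placeholders, and the lexicon-based rate shown here is only one plausible way to score completions (the abstract does not name its two measures).

```python
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical English templates; "[MASK]" marks the slot a model would fill.
TEMPLATES: List[str] = [
    "{subject} works as a [MASK].",
    "{subject} is known as a [MASK].",
]

# Hypothetical subject terms, grouped by gender.
SUBJECTS: Dict[str, List[str]] = {
    "female": ["The woman", "The mother"],
    "male": ["The man", "The father"],
}

# Placeholder lexicon of flagged words (assumption: a real lexicon would go here).
FLAGGED: Set[str] = {"flagged_a", "flagged_b"}


def build_prompts(templates: List[str],
                  subjects: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Fill every template with every subject term, keeping the groups apart."""
    return {
        group: [t.format(subject=s) for s in terms for t in templates]
        for group, terms in subjects.items()
    }


def harmful_rate(prompts: Iterable[str],
                 complete: Callable[[str, int], List[str]],
                 lexicon: Set[str],
                 k: int) -> float:
    """Fraction of top-k completions that appear in the lexicon."""
    flagged = 0
    total = 0
    for prompt in prompts:
        completions = complete(prompt, k)[:k]
        flagged += sum(1 for word in completions if word.lower() in lexicon)
        total += len(completions)
    return flagged / total if total else 0.0


def toy_complete(prompt: str, k: int) -> List[str]:
    """Deterministic stand-in for a masked LM's top-k fill-ins (not a real model)."""
    if prompt.startswith("The woman"):
        pool = ["flagged_a", "teacher", "doctor"]
    else:
        pool = ["engineer", "teacher", "flagged_b"]
    return pool[:k]


if __name__ == "__main__":
    prompts = build_prompts(TEMPLATES, SUBJECTS)
    for group, group_prompts in prompts.items():
        rate = harmful_rate(group_prompts, toy_complete, FLAGGED, k=2)
        print(group, rate)
```

In practice, `toy_complete` would be replaced by a call to an actual masked language model that returns its top-k fill-ins for the masked slot, and the templates would be written in Danish, Swedish, or Norwegian.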

Code & Models

Repositories

samiatouileb/scandinavianhonest (official)

Taxonomy

Topics: Hate Speech and Cyberbullying Detection