Is She Even Relevant? When BERT Ignores Explicit Gender Cues
Jonas Klein, Chiara Manna, Eva Vanmassenhove

TL;DR
This study analyzes how gender bias emerges and persists in a Dutch BERT model, revealing that explicit gender cues often fail to override learned stereotypes, leading to male-default interpretations.
Contribution
It provides one of the first checkpoint-level analyses of gender bias formation in a Dutch Transformer model, highlighting limitations in dynamic contextualization of gender.
Findings
Gender becomes linearly separable around epoch 20.
The model struggles to update gender representations when given explicit gender cues.
Dutch BERT defaults to male interpretations even when the context is explicitly female.
Abstract
Gender bias in large language models has primarily been investigated for English, while languages with grammatical or morphological gender remain comparatively understudied. This paper investigates how and when gender information emerges in a Dutch BERT model trained from scratch, offering one of the first checkpoint-level analyses of bias formation in a Transformer architecture for a language combining overt morphological gender marking and generic forms. By extracting contextual embeddings throughout training, we construct dynamic gender subspaces using linear SVMs to trace when gender becomes linearly encoded and how this encoding evolves over time. Contextual embeddings are often assumed to integrate contextual cues robustly, allowing models to adjust the representation of a word depending on its more local usage. We therefore test whether explicit gender cues in controlled sentence…
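The checkpoint-level probing idea in the abstract (fit a linear SVM on embeddings from each checkpoint and track when gender becomes linearly separable) can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' pipeline: the embeddings are synthetic Gaussian vectors rather than Dutch BERT outputs, the SVM is a small hand-rolled hinge-loss trainer standing in for a library implementation, and the names `train_linear_svm`, `synthetic_checkpoint`, and `probe_over_checkpoints` are hypothetical.

```python
import random


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01, seed=0):
    """Fit (w, b) by stochastic subgradient descent on the
    L2-regularised hinge loss. Labels must be -1 or +1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (dot(w, X[i]) + b)
            # Regulariser step: shrink the weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge step: only examples inside the margin push the boundary.
            if margin < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def accuracy(w, b, X, y):
    hits = sum(1 for xi, yi in zip(X, y)
               if (1 if dot(w, xi) + b >= 0 else -1) == yi)
    return hits / len(X)


def synthetic_checkpoint(separation, n=200, dim=8, seed=0):
    """Toy stand-in for embeddings extracted at one checkpoint (assumption):
    Gaussian noise plus a label-dependent shift along dimension 0."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec[0] += label * separation
        X.append(vec)
        y.append(label)
    return X, y


def probe_over_checkpoints(separations):
    """Train one probe per checkpoint and return held-out accuracies."""
    scores = []
    for step, sep in enumerate(separations):
        X, y = synthetic_checkpoint(sep, seed=step)
        half = len(X) // 2
        w, b = train_linear_svm(X[:half], y[:half])
        scores.append(accuracy(w, b, X[half:], y[half:]))
    return scores


if __name__ == "__main__":
    # Increasing separation mimics gender information emerging during training.
    for step, acc in enumerate(probe_over_checkpoints([0.0, 0.5, 1.5, 3.0])):
        print(f"checkpoint {step}: held-out probe accuracy = {acc:.2f}")
```

In a real analysis, each `synthetic_checkpoint` call would be replaced by embeddings extracted from a saved model checkpoint, and the learned weight vector `w` would serve as the gender direction for that checkpoint. Held-out probe accuracy rising from chance level toward 1.0 is one way to read "gender becomes linearly separable" at a given point in training.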
