Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts

Christina Chance; Da Yin; Dakuo Wang; Kai-Wei Chang

arXiv:2310.10865·cs.CL·April 22, 2026·1 cites


TL;DR

This study examines how language models' understanding of fairytale stories is influenced by gender stereotypes and demonstrates that counterfactual training can improve model robustness and inclusivity.

Contribution

The paper introduces gender perturbation and counterfactual data augmentation as a method for analyzing and reducing gender bias in story comprehension models.

Findings

01 Models show slight performance drops when gender perturbations are applied to the test set.

02 Fine-tuning on counterfactual training data makes models more robust to anti-stereotypical narratives.

03 Inclusion of anti-stereotype examples enhances fairness in downstream tasks.
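The "performance drop" in the first finding can be read as the gap between a model's score on the original test set and its score on the gender-perturbed one. The sketch below illustrates that arithmetic only. It uses exact match as a stand-in metric, which is an assumption and not necessarily the metric used in the paper, and the function names `exact_match`, `mean_score`, and `performance_drop` are hypothetical.

```python
def exact_match(pred: str, gold: str) -> float:
    """Return 1.0 if the normalized prediction equals the gold answer, else 0.0."""
    return float(pred.strip().lower() == gold.strip().lower())


def mean_score(preds: list[str], golds: list[str]) -> float:
    """Average exact-match score over a set of QA predictions."""
    if not golds or len(preds) != len(golds):
        raise ValueError("preds and golds must be non-empty and the same length")
    return sum(exact_match(p, g) for p, g in zip(preds, golds)) / len(golds)


def performance_drop(orig_preds, orig_golds, pert_preds, pert_golds) -> float:
    """Score on the original test set minus score on the perturbed test set.

    A positive value means the model does worse after gender perturbation.
    """
    return mean_score(orig_preds, orig_golds) - mean_score(pert_preds, pert_golds)
```

A value near zero would suggest the model is largely insensitive to the perturbation; the summary above reports only that the observed drops were slight.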

Abstract

In this paper, we study whether language models are affected by learned gender stereotypes during the comprehension of stories. Specifically, we investigate how models respond to gender stereotype perturbations through counterfactual data augmentation. Focusing on Question Answering (QA) tasks in fairytales, we modify the FairytaleQA dataset by swapping gendered character information and introducing counterfactual gender stereotypes during training. This allows us to assess model robustness and examine whether learned biases influence story comprehension. Our results show that models exhibit slight performance drops when faced with gender perturbations in the test set, indicating sensitivity to learned stereotypes. However, when fine-tuned on counterfactual training data, models become more robust to anti-stereotypical narratives. Additionally, we conduct a case study demonstrating how…
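As a rough illustration of the gender swapping and counterfactual data augmentation described in the abstract, the sketch below swaps gendered words in a QA example and appends the swapped copy to the training data. It is a minimal sketch under stated assumptions: the word list `SWAP_PAIRS`, the functions `swap_gender` and `augment`, and the `story`/`question`/`answer` field names are all hypothetical and are not taken from the paper or from the FairytaleQA release.

```python
import re

# Hypothetical, deliberately small swap table. Ambiguous pronouns such as
# "her" (which can map to "him" or "his") are left out on purpose; handling
# them correctly needs part-of-speech information.
SWAP_PAIRS = [
    ("prince", "princess"),
    ("king", "queen"),
    ("boy", "girl"),
    ("brother", "sister"),
    ("father", "mother"),
    ("he", "she"),
]

# Build a bidirectional lookup so each word maps to its counterpart.
SWAP = {}
for a, b in SWAP_PAIRS:
    SWAP[a] = b
    SWAP[b] = a

# Match whole words only, longest first, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(SWAP, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def swap_gender(text: str) -> str:
    """Replace each gendered word with its counterpart, keeping initial capitalization."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        swapped = SWAP[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped
    return _PATTERN.sub(repl, text)


def augment(examples: list[dict]) -> list[dict]:
    """Return the original examples plus one gender-swapped copy of each."""
    augmented = []
    for ex in examples:
        augmented.append(ex)
        augmented.append({key: swap_gender(value) for key, value in ex.items()})
    return augmented


print(swap_gender("The prince said he would wake the princess."))
# prints: The princess said she would wake the prince.
```

Swapping the story, question, and answer together keeps each counterfactual example internally consistent, so the gold answer still matches the perturbed story.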
