TL;DR
This paper evaluates and improves Dutch coreference resolution systems' handling of gender-neutral pronouns, introducing a new evaluation metric and demonstrating the effectiveness of counterfactual data augmentation in reducing bias.
Contribution
It compares debiasing techniques for Dutch coreference resolution, introduces a novel pronoun-specific evaluation metric, and shows CDA's effectiveness in low-resource and unseen pronoun scenarios.
Findings
CDA significantly reduces bias in pronoun resolution.
Delexicalisation does not improve performance.
CDA remains effective with limited data and unseen pronouns.
Abstract
Gender-neutral pronouns are increasingly being introduced across Western languages. Recent evaluations have demonstrated, however, that English NLP systems are unable to correctly process gender-neutral pronouns, with the risk of erasing and misgendering non-binary individuals. This paper examines a Dutch coreference resolution system's performance on gender-neutral pronouns, specifically "hen" and "die". In Dutch, these pronouns were introduced only in 2016, whereas singular "they" has long existed in English. We additionally compare two debiasing techniques for coreference resolution systems in non-binary contexts: Counterfactual Data Augmentation (CDA) and delexicalisation. Moreover, because pronoun performance can be hard to interpret from a general evaluation metric like LEA, we introduce an innovative evaluation metric, the pronoun score, which directly represents the…
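To make the idea behind CDA concrete, the sketch below shows one simple way to create counterfactual training sentences by swapping a gendered pronoun for a gender-neutral one. It is an illustration only, not the paper's procedure: the function names `swap_pronouns` and `augment` and the one-entry mapping are hypothetical, and only the unambiguous subject form "hij" is mapped, because forms such as "zij", "zijn", and "haar" are ambiguous in Dutch and would need part-of-speech or annotation information to rewrite safely.

```python
import re


def swap_pronouns(sentence, mapping):
    """Replace whole-word pronouns according to `mapping` (lowercase keys).

    Capitalisation of the first letter is preserved, so a sentence-initial
    "Hij" becomes "Hen" rather than "hen".
    """
    if not mapping:
        return sentence
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in mapping) + r")\b",
        re.IGNORECASE,
    )

    def replace(match):
        word = match.group(0)
        new = mapping[word.lower()]
        return new.capitalize() if word[0].isupper() else new

    return pattern.sub(replace, sentence)


def augment(sentences, mapping):
    """Return the original sentences plus a counterfactual copy of each
    sentence in which at least one pronoun was swapped."""
    augmented = []
    for sentence in sentences:
        augmented.append(sentence)
        swapped = swap_pronouns(sentence, mapping)
        if swapped != sentence:
            augmented.append(swapped)
    return augmented


# Hypothetical, deliberately minimal mapping (assumption, not from the paper).
HIJ_TO_HEN = {"hij": "hen"}

corpus = ["Hij zegt dat hij morgen komt.", "Het regent."]
for line in augment(corpus, HIJ_TO_HEN):
    print(line)
```

In a real coreference corpus the swap would be applied to annotated tokens so that mention spans and cluster labels carry over unchanged to the counterfactual copy; the plain-string version here only illustrates the substitution step.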
Taxonomy
Methods: Sparse Evolutionary Training
