MISGENDERED: Limits of Large Language Models in Understanding Pronouns

Tamanna Hossain; Sunipa Dev; Sameer Singh

arXiv:2306.03950 · cs.CL · July 10, 2023 · 1 citation

Open Access

TL;DR

This paper evaluates large language models' ability to correctly use non-binary and neo-pronouns, revealing significant limitations and proposing a framework for assessment and improvement.

Contribution

It introduces MISGENDERED, a comprehensive evaluation framework for pronoun usage in language models, highlighting their poor performance and the need for better representation of non-binary pronouns.

Findings

01. Models perform poorly on neo-pronouns (average accuracy 7.7%)

02. Models perform poorly on gender-neutral pronouns (average accuracy 34.2%)

03. Few-shot prompts improve accuracy, but only up to 64.7%

Abstract

Content Warning: This paper contains examples of misgendering and erasure that could be offensive and potentially triggering. Gender bias in language technologies has been widely studied, but research has mostly been restricted to a binary paradigm of gender. It is essential also to consider non-binary gender identities, as excluding them can cause further harm to an already marginalized group. In this paper, we comprehensively evaluate popular language models for their ability to correctly use English gender-neutral pronouns (e.g., singular they, them) and neo-pronouns (e.g., ze, xe, thon) that are used by individuals whose gender identity is not represented by binary pronouns. We introduce MISGENDERED, a framework for evaluating large language models' ability to correctly use preferred pronouns, consisting of (i) instances declaring an individual's pronoun, followed by a sentence…
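The abstract is cut off above, so the following is only a minimal sketch of what an evaluation instance of this kind could look like, not the authors' implementation. It assumes that each instance pairs a pronoun declaration with a templated sentence containing a blank, and that accuracy is the fraction of blanks filled with the declared pronoun form. The names `Instance`, `make_instance`, `accuracy`, the `[PRONOUN]` placeholder, the declaration wording, and the example sentences are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Mapping

BLANK = "[PRONOUN]"  # hypothetical placeholder token, not taken from the paper


@dataclass(frozen=True)
class Instance:
    declaration: str  # sentence stating the individual's pronouns
    sentence: str     # sentence with a blank where a pronoun belongs
    answer: str       # pronoun form expected in the blank


def make_instance(name: str, forms: Mapping[str, str],
                  template: str, slot: str) -> Instance:
    """Build one instance from a name, a pronoun paradigm, and a template.

    The declaration wording is an assumption made for illustration.
    """
    declaration = f"{name}'s pronouns are {forms['nominative']}/{forms['accusative']}."
    sentence = template.format(name=name, blank=BLANK)
    return Instance(declaration, sentence, forms[slot])


def accuracy(instances: Iterable[Instance],
             predict: Callable[[str], str]) -> float:
    """Fraction of instances where the predicted pronoun matches the declared one."""
    items = list(instances)
    if not items:
        return 0.0
    correct = sum(
        predict(f"{inst.declaration} {inst.sentence}").strip().lower()
        == inst.answer.lower()
        for inst in items
    )
    return correct / len(items)


# Toy usage with a stand-in predictor; a real evaluation would query a model.
xe = {"nominative": "xe", "accusative": "xem"}
they = {"nominative": "they", "accusative": "them"}
instances = [
    make_instance("Aamari", xe, "{name} was hungry, so {blank} made lunch.", "nominative"),
    make_instance("Jordan", they, "I met {name} and gave {blank} the book.", "accusative"),
]
always_them = lambda prompt: "them"  # baseline that ignores the declaration
score = accuracy(instances, always_them)
```

Keeping the predictor as a plain callable means the same scoring loop could wrap a masked language model, a generative model, or a few-shot prompt builder without changing how instances are constructed.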

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Text Readability and Simplification