Towards Lexical Gender Inference: A Scalable Methodology using Online Databases

Marion Bartl; Susan Leavy

arXiv:2206.14055·cs.CL·June 29, 2022

TL;DR

This paper introduces a scalable, dictionary-based method for automatically detecting lexical gender in large language datasets, enabling dynamic and high-coverage analysis of gender bias.

Contribution

The authors propose a novel automated approach for lexical gender detection that overcomes limitations of manual lexicon compilation, providing up-to-date and comprehensive gender identification.

Findings

01 Achieves over 80% accuracy in determining the lexical gender of nouns

02 Effective on Wikipedia samples and on word lists from previous research

03 Addresses the static and subjective nature of manually compiled lexicons

Abstract

This paper presents a new method for automatically detecting words with lexical gender in large-scale language datasets. Currently, the evaluation of gender bias in natural language processing relies on manually compiled lexicons of gendered expressions, such as pronouns ('he', 'she', etc.) and nouns with lexical gender ('mother', 'boyfriend', 'policewoman', etc.). However, manually compiled lists become static if they are not periodically updated, and compiling them often involves value judgments by individual annotators and researchers. Moreover, terms not included in a list fall outside the range of analysis. To address these issues, we devised a scalable, dictionary-based method to automatically detect lexical gender that can provide a dynamic, up-to-date analysis with high coverage. Our approach reaches over 80% accuracy in determining the lexical gender of nouns…
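
To make the idea of dictionary-based lexical gender detection concrete, here is a minimal Python sketch. It is not the authors' implementation: the seed word sets, the `TOY_DEFINITIONS` mapping (a stand-in for responses from an online dictionary), and the function name `infer_lexical_gender` are all hypothetical, and the simple counting rule is an assumption chosen for illustration. It labels a word by counting gendered seed terms in its definition.

```python
import re

# Hypothetical seed terms; the heuristics used in the paper may differ.
MASCULINE_SEEDS = {"man", "male", "boy", "father", "son", "he", "his", "him"}
FEMININE_SEEDS = {"woman", "female", "girl", "mother", "daughter", "she", "her", "hers"}

# Toy definitions standing in for entries fetched from an online dictionary.
TOY_DEFINITIONS = {
    "mother": "a female parent",
    "boyfriend": "a male friend or romantic partner",
    "policewoman": "a woman who is a member of a police force",
    "teacher": "a person whose job is to teach",
}


def infer_lexical_gender(word, definitions=TOY_DEFINITIONS):
    """Label a word as masculine, feminine, neutral, or unknown.

    The label comes from counting gendered seed terms in the definition.
    """
    definition = definitions.get(word.lower())
    if definition is None:
        # No dictionary entry available, so no decision can be made.
        return "unknown"

    # Tokenize into whole words so that "female" never matches "male".
    tokens = re.findall(r"[a-z]+", definition.lower())
    masculine_hits = sum(token in MASCULINE_SEEDS for token in tokens)
    feminine_hits = sum(token in FEMININE_SEEDS for token in tokens)

    if masculine_hits > feminine_hits:
        return "masculine"
    if feminine_hits > masculine_hits:
        return "feminine"
    return "neutral"


if __name__ == "__main__":
    for example in ("mother", "boyfriend", "policewoman", "teacher", "zyzzyva"):
        print(example, infer_lexical_gender(example))
```

Because the labels are derived from definitions rather than from a fixed word list, swapping `TOY_DEFINITIONS` for a live dictionary lookup would extend coverage to any word that has an entry, which is the property the abstract emphasizes.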

Code & Models

Repositories

marionbartl/lexical-gender (Official)
