Stop! In the Name of Flaws: Disentangling Personal Names and Sociodemographic Attributes in NLP

Vagrant Gautam; Arjun Subramonian; Anne Lauscher; Os Keyes

arXiv:2405.17159·cs.CL·July 16, 2024


TL;DR

This paper critically examines the use of personal names in NLP to infer sociodemographic attributes, highlighting methodological and ethical challenges, and offers guidelines to improve research practices.

Contribution

It provides an interdisciplinary overview of issues and offers normative recommendations to address validity and ethical concerns in NLP involving names.

Findings

01. Identifies validity issues like systematic error and construct validity.

02. Highlights ethical concerns such as harms and cultural insensitivity.

03. Provides guiding questions and normative recommendations for future research.

Abstract

Personal names simultaneously differentiate individuals and categorize them in ways that are important in a given society. While the natural language processing community has thus associated personal names with sociodemographic characteristics in a variety of tasks, researchers have engaged to varying degrees with the established methodological problems in doing so. To guide future work that uses names and sociodemographic characteristics, we provide an overview of relevant research: first, we present an interdisciplinary background on names and naming. We then survey the issues inherent to associating names with sociodemographic attributes, covering problems of validity (e.g., systematic error, construct validity), as well as ethical concerns (e.g., harms, differential impact, cultural insensitivity). Finally, we provide guiding questions along with normative recommendations to avoid…


Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Interpreting and Communication in Healthcare