Automatically Inferring Gender Associations from Language

Serina Chang; Kathleen McKeown

arXiv:1909.00091 · cs.CL · September 4, 2019 · 1 citation

TL;DR

This paper introduces a method for automatically inferring gender associations from language. It reveals domain-dependent differences in how women and men are discussed, and in human evaluations it significantly outperforms strong baselines.

Contribution

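The paper presents two new datasets, one drawn from celebrity news and one from student reviews of computer science professors, and a novel integration of approaches for inferring gender associations, discovering coherent word clusters, and labeling those clusters with the semantic concepts they represent.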

Findings

01 Large-scale differences in the ways people talk about women and men

02 Methods significantly outperform strong baselines in human evaluations

03 Differences vary across domains: celebrity news versus student reviews of computer science professors

Abstract

In this paper, we pose the question: do people talk about women and men in different ways? We introduce two datasets and a novel integration of approaches for automatically inferring gender associations from language, discovering coherent word clusters, and labeling the clusters for the semantic concepts they represent. The datasets allow us to compare how people write about women and men in two different settings - one set draws from celebrity news and the other from student reviews of computer science professors. We demonstrate that there are large-scale differences in the ways that people talk about women and men and that these differences vary across domains. Human evaluations show that our methods significantly outperform strong baselines.
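To make the idea of a word-level gender association concrete, here is a minimal sketch. It is not the paper's method, which the abstract does not specify. It assumes a simple smoothed log-ratio of relative word frequencies between texts about women and texts about men; the function name `gender_association`, the whitespace tokenization, the smoothing constant `alpha`, and the toy sentences are all illustrative assumptions.

```python
import math
from collections import Counter


def gender_association(women_texts, men_texts, alpha=1.0):
    """Score each word by a smoothed log-ratio of its relative frequency
    in texts about women versus texts about men.

    Positive scores lean toward the women corpus, negative scores toward
    the men corpus. This is an illustrative stand-in, not the paper's method.
    """
    # Naive lowercase whitespace tokenization (an assumption for brevity).
    w_counts = Counter(tok for t in women_texts for tok in t.lower().split())
    m_counts = Counter(tok for t in men_texts for tok in t.lower().split())
    vocab = set(w_counts) | set(m_counts)

    # Add-alpha smoothing so words unseen in one corpus get a finite score.
    w_total = sum(w_counts.values()) + alpha * len(vocab)
    m_total = sum(m_counts.values()) + alpha * len(vocab)

    scores = {}
    for word in vocab:
        p_w = (w_counts[word] + alpha) / w_total
        p_m = (m_counts[word] + alpha) / m_total
        scores[word] = math.log(p_w / p_m)
    return scores


# Toy, made-up sentences purely for illustration (not from the paper's datasets).
women = ["she wore a dress", "a stunning dress"]
men = ["he scored a goal", "a great goal"]
scores = gender_association(women, men)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In a fuller pipeline, words with strong scores in either direction would then be grouped into clusters and each cluster given a concept label, which are the later steps the abstract describes.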

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Authorship Attribution and Profiling