TL;DR
Using a corpus of 14,000 Twitter users, this study examines how gender, linguistic style, and social networks relate. It argues that gendered language is better described by clusters of styles and topical interests than by a female/male binary.
Contribution
It introduces a novel corpus and clusters its users by style and topical interest to capture multifaceted gendered language styles. It also examines how departures from population-level gender patterns in language relate to social network structure.
Findings
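User clusters differ in style and topical interest; many have strong gender orientations, yet some use language in ways that conflict with population-level statistics.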
Individuals with atypical language styles have fewer same-gender social ties.
Social network homophily correlates with gendered language markers.
Abstract
We present a study of the relationship between gender, linguistic style, and social networks, using a novel corpus of 14,000 Twitter users. Prior quantitative work on gender often treats this social variable as a female/male binary; we argue for a more nuanced approach. By clustering Twitter users, we find a natural decomposition of the dataset into various styles and topical interests. Many clusters have strong gender orientations, but their use of linguistic resources sometimes directly conflicts with the population-level language statistics. We view these clusters as a more accurate reflection of the multifaceted nature of gendered language styles. Previous corpus-based work has also had little to say about individuals whose linguistic styles defy population-level gender patterns. To identify such individuals, we train a statistical classifier, and measure the classifier confidence…
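The link between classifier confidence and same-gender ties reported under Findings can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the user IDs, the gender labels, the `p_female` scores (standing in for the output of some trained classifier), and the tie lists are invented toy values, and the helper names are hypothetical. The paper's actual classifier, features, and network measures are not described in the text above.

```python
from math import sqrt

def same_gender_share(user, gender, ties):
    """Fraction of a user's ties whose labeled gender matches the user's."""
    friends = ties.get(user, [])
    if not friends:
        return None  # undefined for users with no ties
    same = sum(1 for f in friends if gender[f] == gender[user])
    return same / len(friends)

def confidence_in_label(p_female, label):
    """Probability the (hypothetical) classifier assigns to the labeled gender."""
    return p_female if label == "F" else 1.0 - p_female

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy data, invented for illustration only (not from the paper's corpus).
gender = {"a": "F", "b": "F", "c": "M", "d": "M", "e": "F"}
p_female = {"a": 0.9, "b": 0.4, "c": 0.2, "d": 0.7, "e": 0.8}
ties = {
    "a": ["b", "e", "c"],
    "b": ["c", "d", "a"],
    "c": ["d", "a"],
    "d": ["a", "b", "e"],
    "e": ["a", "b"],
}

# One row per user: (user, confidence in labeled gender, same-gender tie share).
rows = []
for user in gender:
    share = same_gender_share(user, gender, ties)
    if share is not None:
        conf = confidence_in_label(p_female[user], gender[user])
        rows.append((user, conf, share))

r = pearson([conf for _, conf, _ in rows], [share for _, _, share in rows])
print(f"correlation between confidence and same-gender share: {r:.2f}")
```

A low confidence in a user's labeled gender marks that user's language as atypical for the population-level pattern. In this toy network the correlation is positive by construction; it shows the shape of the comparison and is not evidence for the paper's result.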
