The Echoes of the 'I': Tracing Identity with Demographically Enhanced Word Embeddings
Ivan Smirnov

TL;DR
This paper presents a novel method that enhances word embeddings with socio-demographic data to study social identity empirically. As a proof of concept, it reproduces and extends established findings on gendered self-views, and it supports comparisons between social groups.
Contribution
It introduces a new approach to incorporate socio-demographic information into word embeddings for social science research, bridging computational methods and social identity theory.
Findings
Successfully reproduces gendered self-view findings
Enables analysis of social group differences
Applicable to social media data
Abstract
Identity is one of the most commonly studied constructs in social science. However, despite extensive theoretical work on identity, there remains a need for additional empirical data to validate and refine existing theories. This paper introduces a novel approach to studying identity by enhancing word embeddings with socio-demographic information. As a proof of concept, we demonstrate that our approach successfully reproduces and extends established findings regarding gendered self-views. Our methodology can be applied in a wide variety of settings, allowing researchers to tap into a vast pool of naturally occurring data, such as social media posts. Unlike similar methods already introduced in computer science, our approach allows for the study of differences between social groups. This could be particularly appealing to social scientists and may encourage the faster adoption of…
Taxonomy
Topics: Multilingual Education and Policy
