Quantifying Intimacy in Language
Jiaxin Pei, David Jurgens

TL;DR
This paper introduces a computational framework and dataset for quantifying intimacy in language. Its model predicts the intimacy level of questions accurately (Pearson's r=0.87), and its analyses show how social norms shape linguistic expression across contexts.
Contribution
It presents a novel deep learning model and large dataset to measure intimacy in language, linking computational analysis with social psychology insights.
Findings
The deep learning model predicts the intimacy level of questions with Pearson's r=0.87.
Individuals modulate the intimacy of their language to match social norms around gender, social distance, and audience.
Linguistic cues reflect social relationships and audience considerations.
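For readers unfamiliar with the metric behind the first finding, Pearson's r measures the linear agreement between two lists of numbers, here a model's predicted intimacy scores and the reference scores for the same questions. The sketch below is a minimal illustration only: the function name `pearson_r` and the example scores are hypothetical, and it does not reproduce the authors' evaluation code or data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores invented for illustration; not taken from the paper.
reference_scores = [-0.8, -0.3, 0.1, 0.4, 0.9]
predicted_scores = [-0.6, -0.4, 0.0, 0.5, 0.7]
print(round(pearson_r(reference_scores, predicted_scores), 3))
```

A value near 1 means the predictions rise and fall with the reference scores; a value near 0 means no linear relationship.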
Abstract
Intimacy is a fundamental aspect of how we relate to others in social settings. Language encodes the social information of intimacy through both topics and other more subtle cues (such as linguistic hedging and swearing). Here, we introduce a new computational framework for studying expressions of intimacy in language with an accompanying dataset and deep learning model for accurately predicting the intimacy level of questions (Pearson's r=0.87). Through analyzing a dataset of 80.5M questions across social media, books, and films, we show that individuals employ interpersonal pragmatic moves in their language to align their intimacy with social settings. Then, in three studies, we further demonstrate how individuals modulate their intimacy to match social norms around gender, social distance, and audience, each validating key findings from studies in social psychology. Our work…
Taxonomy
Topics: Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection · Digital Communication and Language
