LMSOC: An Approach for Socially Sensitive Pretraining

Vivek Kulkarni; Shubhanshu Mishra; Aria Haghighi

arXiv:2110.10319·cs.CL·October 22, 2021

TL;DR

This paper introduces LMSOC, a method that augments large-scale pretrained language models with learned representations of the speaker's social context, substantially improving performance on geographically sensitive language tasks.

Contribution

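The paper presents an approach that integrates social context into language model pretraining by learning social-context representations with graph representation learning. It targets a gap in current transformer-based language models, which do not account for the speaker's geographical, temporal, and other social contexts.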

Findings

01. Over 100% relative improvement on MRR for social language tasks
02. Effective incorporation of geographical and social context into language models
03. Demonstrates the importance of social context in NLP performance
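
To make the first finding concrete, the sketch below computes mean reciprocal rank (MRR) and the relative improvement between two systems. The ranked lists and gold answers are made-up toy data, and the function names are hypothetical; none of this comes from the paper's evaluation code.

```python
def mean_reciprocal_rank(ranked_lists, gold_answers):
    """Average of 1/rank of the gold answer; a missing gold answer scores 0."""
    total = 0.0
    for ranked, gold in zip(ranked_lists, gold_answers):
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)  # ranks are 1-based
    return total / len(gold_answers)


def relative_improvement(baseline, new):
    """Relative lift of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Toy data (illustrative assumption, not results from the paper).
gold = ["cricket", "football"]
baseline_ranks = [["football", "cricket"], ["cricket", "football"]]  # gold at rank 2 twice
improved_ranks = [["cricket", "football"], ["football", "cricket"]]  # gold at rank 1 twice

baseline_mrr = mean_reciprocal_rank(baseline_ranks, gold)
improved_mrr = mean_reciprocal_rank(improved_ranks, gold)
lift = relative_improvement(baseline_mrr, improved_mrr)
```

In this toy case the MRR goes from 0.5 to 1.0, a relative improvement of exactly 100%. A lift above 100% therefore means the MRR more than doubled relative to the baseline.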

Abstract

While large-scale pretrained language models have been shown to learn effective linguistic representations for many NLP tasks, there remain many real-world contextual aspects of language that current approaches do not capture. For instance, consider a cloze-test "I enjoyed the ____ game this weekend": the correct answer depends heavily on where the speaker is from, when the utterance occurred, and the speaker's broader social milieu and preferences. Although language depends heavily on the geographical, temporal, and other social contexts of the speaker, these elements have not been incorporated into modern transformer-based language models. We propose a simple but effective approach to incorporate speaker social context into the learned representations of large-scale language models. Our method first learns dense representations of social contexts using graph representation learning…
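
The cloze example in the abstract can be illustrated with a toy sketch in which the ranking of candidate fill-ins changes with a social-context vector. The two-dimensional vectors, the country codes, and the dot-product scoring are all assumptions chosen for illustration; they are not LMSOC's architecture, its learned representations, or its data.

```python
# Hypothetical dense social-context vectors. In the paper these are learned
# with graph representation learning; the values here are made up.
context_vecs = {
    "US": [1.0, 0.0],
    "IN": [0.0, 1.0],
}

# Hypothetical embeddings for candidate words filling
# "I enjoyed the ____ game this weekend".
candidate_vecs = {
    "football": [0.9, 0.1],
    "cricket": [0.1, 0.9],
}


def rank_candidates(context):
    """Rank candidates by dot product with the speaker's context vector."""
    c = context_vecs[context]
    scores = {
        word: sum(a * b for a, b in zip(c, vec))
        for word, vec in candidate_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


us_ranking = rank_candidates("US")
in_ranking = rank_candidates("IN")
```

With these toy values the top candidate is "football" for the "US" context and "cricket" for the "IN" context, which is the kind of context dependence a model without speaker information cannot express.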

Code & Models

Repositories

twitter-research/lmsoc (official)