Mining the Demographics of Political Sentiment from Twitter Using Learning from Label Proportions

Ehsan Mohammady Ardehaly; Aron Culotta

arXiv:1708.08000·cs.SI·January 1, 2018


TL;DR

This paper introduces a scalable model that infers political sentiment and demographics from Twitter data. It trains on population-level data rather than costly individual annotations, and its estimates closely track traditional polls.

Contribution

The paper proposes Weighted Label Regularization (WLR), a Learning from Label Proportions (LLP) model that supports hierarchical sample weighting, enabling demographic and opinion inference from social media data without individual labels.

Findings

01

Model achieves 28-44% error reduction compared to baselines.

02

Estimates align closely with traditional polling data.

03

Demonstrates ability to analyze linguistic and demographic interactions over time.

Abstract

Opinion mining and demographic attribute inference have many applications in social science. In this paper, we propose models to infer daily joint probabilities of multiple latent attributes from Twitter data, such as political sentiment and demographic attributes. Since it is costly and time-consuming to annotate data for traditional supervised classification, we instead propose scalable Learning from Label Proportions (LLP) models for demographic and opinion inference using U.S. Census, national and state political polls, and the Cook Partisan Voting Index as population-level data. In LLP classification settings, the training data is divided into a set of unlabeled bags, where only the label distribution of each bag is known, removing the requirement of instance-level annotations. Our proposed LLP model, Weighted Label Regularization (WLR), provides a scalable generalization of prior…
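
To make the LLP setting described in the abstract concrete, here is a minimal sketch of training a binary classifier from bags whose only supervision is a known label proportion. It is an illustration under stated assumptions, not the authors' WLR implementation: the model is a plain logistic regression, the loss is the cross-entropy between each bag's known proportion and the weighted mean of the predicted posteriors (equal to the KL divergence up to a constant), and the per-sample weights are simply normalized within each bag. The hierarchical arrangement of bags mentioned in the contribution is not reproduced. All function names, parameters, and the toy data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bag_loss_and_grad(theta, X, w, p_target, eps=1e-12):
    """Cross-entropy between a bag's known positive proportion and the
    weighted mean of the model's predicted posteriors, plus its gradient."""
    w = w / w.sum()                      # normalize sample weights inside the bag
    s = sigmoid(X @ theta)               # per-instance posterior P(y=1 | x)
    q = float(w @ s)                     # weighted predicted proportion for the bag
    loss = -(p_target * np.log(q + eps) + (1 - p_target) * np.log(1 - q + eps))
    dloss_dq = -p_target / (q + eps) + (1 - p_target) / (1 - q + eps)
    dq_dtheta = X.T @ (w * s * (1 - s))  # chain rule through the sigmoid
    return loss, dloss_dq * dq_dtheta

def fit_llp(bags, n_features, lr=0.5, n_iter=2000):
    """Plain gradient descent on the mean bag-level loss (no instance labels)."""
    theta = np.zeros(n_features)
    for _ in range(n_iter):
        grad = np.zeros(n_features)
        for X, w, p in bags:
            grad += bag_loss_and_grad(theta, X, w, p)[1]
        theta -= lr * grad / len(bags)
    return theta

def predict_bag_proportion(theta, X, w):
    return float((w / w.sum()) @ sigmoid(X @ theta))

# Toy data (an assumption for illustration): five bags whose single feature
# has a different mean per bag, with proportions generated by a hidden model.
rng = np.random.default_rng(0)
true_theta = np.array([0.0, 1.0])        # [bias, slope], hidden from training
bags = []
for mean in (-2.0, -1.0, 0.0, 1.0, 2.0):
    x = rng.normal(mean, 1.0, size=300)
    X = np.column_stack([np.ones_like(x), x])
    w = rng.uniform(0.5, 1.5, size=300)  # arbitrary positive sample weights
    p = predict_bag_proportion(true_theta, X, w)
    bags.append((X, w, p))

theta = fit_llp(bags, n_features=2)
for X, w, p in bags:
    print(f"target={p:.3f}  fitted={predict_bag_proportion(theta, X, w):.3f}")
```

The only training signal is one number per bag, which is the role that census figures or poll results play in the paper's setting. The sample weights change how much each instance contributes to its bag's predicted proportion, which is the general idea that weighting samples inside bags relies on.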
