Analyzing COVID-19 Tweets with Transformer-based Language Models

Philip Feldman; Sim Tiwari; Charissa S. L. Cheah; James R. Foulds; Shimei Pan

arXiv:2104.10259·cs.CL·May 7, 2021

TL;DR

This paper presents a method that trains Transformer-based Language Models on COVID-19 tweets and probes them with prompts to analyze public opinion and biases, producing results that resemble polling on social and health issues.

Contribution

The paper introduces prompt-based querying of Transformer-based Language Models (TLMs) trained on social media data as a way to understand public opinion and biases.

Findings

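01 Prompt-based queries to GPT models trained on tweets reveal the biases and opinions of the users who wrote them.

02 The models produce results that resemble polling on social, political, and public health topics.

03 The approach is a promising way to understand public opinion on social media at scale.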

Abstract

This paper describes a method for using Transformer-based Language Models (TLMs) to understand public opinion from social media posts. In this approach, we train a set of GPT models on several COVID-19 tweet corpora that reflect populations of users with distinctive views. We then use prompt-based queries to probe these models to reveal insights into the biases and opinions of the users. We demonstrate how this approach can be used to produce results which resemble polling the public on diverse social, political and public health issues. The results on the COVID-19 tweet data show that transformer language models are promising tools that can help us understand public opinions on social media at scale.
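
The probing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `poll_model` and `make_stub_generator`, the example prompt, and the keyword-based categorization are all assumptions made for illustration, and a stub stands in for a GPT model fine-tuned on a tweet corpus. The idea is simply to sample many completions for one prompt and report the share that falls into each category, which is what gives the output its poll-like shape.

```python
import itertools
from collections import Counter
from typing import Callable, Dict, List


def make_stub_generator(completions: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a fine-tuned GPT model.

    It ignores the prompt and cycles through fixed completions so the
    sketch runs without any model weights.
    """
    pool = itertools.cycle(completions)

    def generate(prompt: str) -> str:
        return next(pool)

    return generate


def poll_model(
    generate: Callable[[str], str],
    prompt: str,
    categories: Dict[str, List[str]],
    n_samples: int = 100,
) -> Dict[str, float]:
    """Sample completions for one prompt and return the share per category.

    Categorization by keyword match is an assumption of this sketch; the
    first category with a matching keyword wins, otherwise "other".
    """
    counts: Counter = Counter()
    for _ in range(n_samples):
        completion = generate(prompt).lower()
        label = "other"
        for name, keywords in categories.items():
            if any(keyword in completion for keyword in keywords):
                label = name
                break
        counts[label] += 1
    return {name: counts[name] / n_samples for name in list(categories) + ["other"]}


# Illustrative usage with made-up completions (not data from the paper).
stub = make_stub_generator(["a hoax", "a serious threat", "a hoax", "just the flu"])
shares = poll_model(
    stub,
    prompt="I think the virus is",
    categories={"dismissive": ["hoax", "flu"], "concerned": ["threat", "danger"]},
    n_samples=4,
)
print(shares)
```

In practice `generate` would wrap sampling from a model trained on one user population's tweets, and the same prompt would be run against each model so the resulting shares can be compared across populations.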

Taxonomy

Methods: Multi-Head Attention · Linear Layer · Cosine Annealing · Softmax · Attention Dropout · Linear Warmup With Cosine Annealing · Layer Normalization · Residual Connection · Weight Decay