Adaptation of domain-specific transformer models with text oversampling for sentiment analysis of social media posts on Covid-19 vaccines

Anmol Bansal; Arjun Choudhry; Anubhav Sharma; Seba Susan

arXiv:2209.10966·cs.CL·January 16, 2023·1 cites

TL;DR

This paper investigates the effectiveness of domain-specific transformer models and text oversampling techniques in improving sentiment analysis accuracy on small, imbalanced datasets of Covid-19 vaccine-related social media posts.

Contribution

The paper applies LMOTE oversampling together with domain-specific transformers such as CT-BERT and BERTweet to improve sentiment classification of Covid-19 vaccine tweets.

Findings

01 Text oversampling improves model accuracy on small, imbalanced datasets.

02 Domain-specific transformers outperform general models in sentiment analysis.

03 Oversampling is particularly effective for minority sentiment classes.

Abstract

Covid-19 has spread across the world and several vaccines have been developed to counter its surge. To identify the correct sentiments associated with the vaccines from social media posts, we fine-tune various state-of-the-art pre-trained transformer models on tweets associated with Covid-19 vaccines. Specifically, we use the recently introduced state-of-the-art pre-trained transformer models RoBERTa, XLNet and BERT, and the domain-specific transformer models CT-BERT and BERTweet that are pre-trained on Covid-19 tweets. We further explore the option of text augmentation by oversampling using Language Model based Oversampling Technique (LMOTE) to improve the accuracies of these models, specifically, for small sample datasets where there is an imbalanced class distribution among the positive, negative and neutral sentiment classes. Our results summarize our findings on the suitability of…
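The class-balancing idea in the abstract can be illustrated with a minimal sketch. This is not LMOTE: the abstract does not give that method's details, so the example below swaps in plain random oversampling (duplicating minority-class examples) as a stand-in. The function name `random_oversample` and the toy tweets are hypothetical and exist only for illustration.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class texts until every class matches the largest one.

    Stand-in for LMOTE: it repeats existing examples instead of
    generating new text with a language model.
    """
    rng = random.Random(seed)

    # Group the texts by their sentiment label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data with an imbalanced class distribution.
texts = [
    "got my second dose today",
    "vaccine centre opens at nine",
    "new batch arrives next week",
    "booking slots are online",
    "so relieved to be vaccinated",
    "grateful to the nurses",
    "my arm hurts and I regret it",
]
labels = ["neutral"] * 4 + ["positive"] * 2 + ["negative"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

In a fine-tuning pipeline, a step like this would normally be applied to the training split only, so that duplicated examples do not leak into validation or test data.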

Code & Models

Repositories

ace117mc/transformer-models-covid (tf, Official)

Taxonomy

Topics: Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection · Vaccine Coverage and Hesitancy

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Weight Decay · Attention Dropout · Residual Connection · Layer Normalization · Dense Connections · Multi-Head Attention · WordPiece