Adaptation of domain-specific transformer models with text oversampling for sentiment analysis of social media posts on Covid-19 vaccines
Anmol Bansal, Arjun Choudhry, Anubhav Sharma, Seba Susan

TL;DR
This paper investigates the effectiveness of domain-specific transformer models and text oversampling techniques in improving sentiment analysis accuracy on small, imbalanced datasets of Covid-19 vaccine-related social media posts.
Contribution
The paper pairs LMOTE (Language Model based Oversampling Technique) with domain-specific transformers such as CT-BERT and BERTweet to improve sentiment classification of Covid-19 vaccine tweets.
Findings
Text oversampling improves model accuracy on small, imbalanced datasets.
Domain-specific transformers outperform general models in sentiment analysis.
Oversampling is particularly effective for minority sentiment classes.
Abstract
Covid-19 has spread across the world, and several vaccines have been developed to counter its surge. To identify the sentiments associated with the vaccines from social media posts, we fine-tune various state-of-the-art pre-trained transformer models on tweets associated with Covid-19 vaccines. Specifically, we use the recently introduced pre-trained transformer models RoBERTa, XLNet and BERT, and the domain-specific transformer models CT-BERT and BERTweet that are pre-trained on Covid-19 tweets. We further explore text augmentation by oversampling using the Language Model based Oversampling Technique (LMOTE) to improve the accuracies of these models, specifically for small-sample datasets with an imbalanced class distribution among the positive, negative and neutral sentiment classes. Our results summarize our findings on the suitability of…
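To make the class-imbalance idea in the abstract concrete, here is a minimal sketch of oversampling on a three-class sentiment set. It is not LMOTE: per its name, LMOTE uses a language model to produce new minority-class text, whereas this stand-in simply duplicates existing minority examples at random. The function name `random_oversample` and the toy tweets are invented for illustration and do not come from the paper.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples until every class matches the largest one.

    Plain random oversampling, used here as a simple stand-in for LMOTE.
    """
    rng = random.Random(seed)

    # Group the texts by their sentiment label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Sample with replacement to fill the gap for smaller classes.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data: 4 neutral, 2 positive, 1 negative.
texts = [
    "second dose scheduled for monday",
    "vaccine centre opens at nine",
    "new batch arrived in the city",
    "booking site updated today",
    "got my shot, feeling great",
    "so relieved my parents are vaccinated",
    "side effects were awful",
]
labels = ["neutral"] * 4 + ["positive"] * 2 + ["negative"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(labels))
print(Counter(balanced_labels))
```

Oversampling of this kind should be applied to the training split only, so that duplicated or generated examples never leak into the evaluation data.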
Taxonomy
Topics: Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection · Vaccine Coverage and Hesitancy
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Weight Decay · Attention Dropout · Residual Connection · Layer Normalization · Dense Connections · Multi-Head Attention · WordPiece
