Text Augmentations with R-drop for Classification of Tweets Self Reporting Covid-19
Sumam Francis, Marie-Francine Moens

TL;DR
This paper introduces a classification approach for Covid-19 self-report tweets using data augmentation and R-drop to improve accuracy, achieving an F1 score of 0.877.
Contribution
The study presents a novel combination of textual augmentations and R-drop regularization for Covid-19 tweet classification.
Findings
Achieved an F1 score of 0.877 on the test set.
Outperformed the task mean and median scores.
The best model combined R-drop with three augmentations: synonym substitution, reserved words, and back translation.
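The synonym-substitution step can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the `SYNONYMS` table, the `RESERVED` set, the `augment` function, and its `prob` and `seed` parameters are hypothetical, and reading "reserved words" as task-critical tokens that are protected from substitution is one plausible interpretation, not something this summary confirms.

```python
import random

# Hypothetical synonym table; a real system would likely use a lexical resource.
SYNONYMS = {
    "tested": ["checked"],
    "positive": ["good"],
    "sick": ["ill", "unwell"],
}

# Assumed meaning of "reserved words": tokens that must never be replaced,
# because changing them could flip the label of a self-report tweet.
RESERVED = {"covid", "covid-19", "positive", "negative"}


def augment(text, prob=0.5, seed=0):
    """Return a copy of text with some non-reserved words swapped for synonyms."""
    rng = random.Random(seed)
    out = []
    for token in text.split():
        key = token.lower()
        replaceable = key not in RESERVED and key in SYNONYMS
        if replaceable and rng.random() < prob:
            out.append(rng.choice(SYNONYMS[key]))
        else:
            out.append(token)
    return " ".join(out)
```

With `prob=1.0`, `augment("I tested positive for covid")` replaces "tested" but leaves "positive" alone, even though the table contains a synonym for it. That is the point of the reserved set under this reading: the augmented tweet keeps the words that carry the label.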
Abstract
This paper presents models created for the Social Media Mining for Health 2023 shared task. Our team addressed the first task, classifying tweets that self-report a Covid-19 diagnosis. Our approach involves a classification model that incorporates diverse textual augmentations and utilizes R-drop to augment data and mitigate overfitting, boosting model efficacy. Our leading model, enhanced with R-drop and augmentations like synonym substitution, reserved words, and back translations, outperforms the task mean and median scores. Our system achieves an F1 score of 0.877 on the test set.
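For readers unfamiliar with R-drop, the sketch below shows the objective as it is commonly described: the same input is passed through the model twice with dropout active, both outputs are scored with cross-entropy, and a symmetric KL term pulls the two predicted distributions together. This is a dependency-free illustration, not the authors' implementation. The function names and the weight `alpha` are assumptions, and the abstract does not state the value the authors used.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def r_drop_loss(logits_a, logits_b, label, alpha=1.0):
    """R-drop style loss for one example.

    logits_a and logits_b stand for the outputs of two forward passes of the
    same input with different dropout masks. alpha is an assumed weight.
    """
    p = softmax(logits_a)
    q = softmax(logits_b)
    # Cross-entropy of the gold label under both passes.
    ce = -math.log(p[label]) - math.log(q[label])
    # Symmetric KL between the two predicted distributions.
    kl = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return ce + alpha * kl
```

When the two passes agree exactly, the KL term is zero and the loss reduces to the summed cross-entropy. When they disagree, the extra term penalizes the inconsistency, which is how the method regularizes a model trained with dropout.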
Taxonomy
Topics: Misinformation and Its Impacts · Topic Modeling · COVID-19 diagnosis using AI
