Text Augmentations with R-drop for Classification of Tweets Self Reporting Covid-19

Sumam Francis; Marie-Francine Moens

arXiv:2311.03420·cs.CL·November 8, 2023·1 cites

Open Access

TL;DR

This paper introduces a classification approach for Covid-19 self-report tweets using data augmentation and R-drop to improve accuracy, achieving an F1 score of 0.877.

Contribution

The study presents a novel combination of textual augmentations and R-drop regularization for Covid-19 tweet classification.

Findings

01. Achieved an F1 score of 0.877 on the test set.

02. Outperformed the task mean and median scores.

03. Effective use of synonym substitution, reserved words, and back translations.

Abstract

This paper presents models created for the Social Media Mining for Health 2023 shared task. Our team addressed the first task, classifying tweets that self-report Covid-19 diagnosis. Our approach involves a classification model that incorporates diverse textual augmentations and utilizes R-drop to augment data and mitigate overfitting, boosting model efficacy. Our leading model, enhanced with R-drop and augmentations like synonym substitution, reserved words, and back translations, outperforms the task mean and median scores. Our system achieves an impressive F1 score of 0.877 on the test set.
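The abstract names R-drop as the regularizer but does not spell out the loss. As a rough illustration only, the sketch below follows the general R-drop recipe: the same input is passed through the model twice with dropout active, and the loss adds a symmetric KL term between the two predicted distributions to the usual cross-entropy. The function names, the weight `alpha`, and the two-class logits are assumptions made for this example, not details taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps)) for pi, qi in zip(p, q))


def r_drop_loss(logits_a, logits_b, label, alpha=1.0):
    """R-drop style loss for one example.

    logits_a, logits_b: outputs of two forward passes of the same input,
        which differ only because dropout sampled different masks.
    label: index of the gold class.
    alpha: weight of the consistency term (hypothetical default).
    """
    p = softmax(logits_a)
    q = softmax(logits_b)
    # Cross-entropy of both passes against the gold label.
    nll = -math.log(p[label]) - math.log(q[label])
    # Symmetric KL pushes the two dropout sub-models to agree.
    sym_kl = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return nll + alpha * sym_kl


if __name__ == "__main__":
    # Two passes that disagree are penalized more than two that agree.
    print(r_drop_loss([2.0, 0.0], [1.5, 0.2], label=0))
    print(r_drop_loss([2.0, 0.0], [2.0, 0.0], label=0))
```

When the two passes produce identical logits the KL term vanishes and the loss reduces to twice the ordinary cross-entropy, so the extra term only acts on disagreement between the two dropout samples.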

Taxonomy

Topics: Misinformation and Its Impacts · Topic Modeling · COVID-19 diagnosis using AI