CIA_NITT at WNUT-2020 Task 2: Classification of COVID-19 Tweets Using Pre-trained Language Models

Yandrapati Prakash Babu; Rajagopal Eswari

arXiv:2009.05782·cs.CL·September 15, 2020

TL;DR

This paper applies pre-trained language models to classifying COVID-19 tweets as informative or not in WNUT-2020 shared task 2. A CT-BERT model reaches an F1-score of 88.7%, and an ensemble of CT-BERT, RoBERTa, and SVM reaches 88.52%.

Contribution

It presents two models built on pre-trained language models for identifying informative COVID-19 tweets: a CT-BERT classifier and an ensemble of CT-BERT, RoBERTa, and SVM.

Findings

01 CT-BERT achieved an F1-score of 88.7%

02 Ensemble of CT-BERT, RoBERTa, and SVM achieved an F1-score of 88.52%

03 Pre-trained language models are effective for COVID-19 tweet classification

Abstract

This paper presents our models for WNUT-2020 shared task 2, which involves identifying informative COVID-19 related tweets. We treat this as a binary text classification problem and experiment with pre-trained language models. Our first model, which is based on CT-BERT, achieves an F1-score of 88.7%, and our second model, an ensemble of CT-BERT, RoBERTa, and SVM, achieves an F1-score of 88.52%.
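As a rough illustration of the ensemble idea, the sketch below combines the binary predictions of three classifiers and scores the result with F1. The abstract does not say how the three models are combined, so majority voting is an assumption here. The function names, the 1/0 label encoding (1 = informative), and the toy predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per tweet.

    Assumption: simple majority voting; the paper's abstract does not
    state the actual combination rule.
    """
    combined = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest vote count
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def f1_score(gold, predicted, positive=1):
    """F1 for the positive class (here 1 = informative, an assumed encoding)."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy predictions for four tweets (made up for illustration only).
ct_bert = [1, 0, 1, 1]
roberta = [1, 1, 0, 1]
svm = [0, 0, 1, 0]
gold = [1, 0, 0, 1]

ensemble = majority_vote([ct_bert, roberta, svm])
print(ensemble)                           # [1, 0, 1, 1]
print(f"{f1_score(gold, ensemble):.2f}")  # 0.80
```

With three voters and two labels a tie cannot occur, which is one reason an odd number of models is convenient for this kind of voting scheme.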

Taxonomy

Methods: Linear Layer · Softmax · Layer Normalization · Weight Decay · Dropout · Linear Warmup With Linear Decay · Dense Connections · Attention Dropout · WordPiece · Multi-Head Attention