TL;DR
This paper presents a straightforward method that fine-tunes DistilBERT to classify COVID-19-related tweets, both for self-reported symptoms and for the type of report (self-report, non-personal report, or literature/news mention), with the aim of improving health data extraction from social media.
Contribution
The study examines fine-tuning strategies for DistilBERT on two COVID-19 tweet classification tasks, including the impact of cross-task fine-tuning (first fine-tuning on the other task, then on the target task).
Findings
Fine-tuning DistilBERT improves classification accuracy.
Cross-task fine-tuning enhances model performance.
The approach is effective for social media health data extraction.
Abstract
We describe our straightforward approach for Tasks 5 and 6 of the 2021 Social Media Mining for Health Applications (SMM4H) shared tasks. Our system is based on fine-tuning DistilBERT on each task, as well as first fine-tuning the model on the other task. We explore how much fine-tuning is necessary for accurately classifying tweets as containing self-reported COVID-19 symptoms (Task 5) or determining whether a tweet related to COVID-19 is self-reporting, non-personal reporting, or a literature/news mention of the virus (Task 6).
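The two training setups described in the abstract (fine-tuning directly on the target task, or first fine-tuning on the other task) can be sketched as a list of training stages. This is a minimal illustration, not the authors' code: the label counts are read off the abstract (Task 5 treated as binary, Task 6 as three-way), and the checkpoint name `distilbert-base-uncased`, the `finetuned-...` names, the epoch count, and the `Stage` / `build_schedule` helpers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Assumption: label counts inferred from the abstract, not confirmed by the paper.
NUM_LABELS = {"task5": 2, "task6": 3}


@dataclass(frozen=True)
class Stage:
    task: str        # dataset used for training in this stage
    num_labels: int  # size of the classification head for this stage
    init_from: str   # checkpoint the encoder weights are loaded from
    epochs: int      # hypothetical training length


def other_task(task: str) -> str:
    """Return the companion task used for cross-task fine-tuning."""
    return "task6" if task == "task5" else "task5"


def build_schedule(target: str, cross_task: bool, epochs: int = 3,
                   base: str = "distilbert-base-uncased") -> List[Stage]:
    """Plan the fine-tuning stages for one target task.

    With cross_task=True, the model is first fine-tuned on the other task,
    and the resulting encoder initializes the final stage on the target task.
    """
    if target not in NUM_LABELS:
        raise ValueError(f"unknown task: {target}")
    stages = []
    init = base
    if cross_task:
        aux = other_task(target)
        stages.append(Stage(aux, NUM_LABELS[aux], init, epochs))
        init = f"finetuned-{aux}"  # hypothetical name for the stage-1 output
    stages.append(Stage(target, NUM_LABELS[target], init, epochs))
    return stages


if __name__ == "__main__":
    for stage in build_schedule("task5", cross_task=True):
        print(stage)
```

Because the two tasks have different label sets under these assumptions, only the encoder weights would carry over between stages; the classification head would need to be re-initialized with `num_labels` outputs at the start of the second stage.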
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Weight Decay · Layer Normalization · Linear Warmup With Linear Decay · Attention Dropout · WordPiece · Residual Connection
