Characterizing drug mentions in COVID-19 Twitter Chatter
Ramya Tekumalla, Juan M. Banda

TL;DR
This study analyzes Twitter chatter about COVID-19 drugs, highlighting the importance of machine learning and preprocessing to accurately identify drug mentions amid informal language and misspellings.
Contribution
The paper combines machine learning with traditional automated methods to improve drug mention detection in social media data, addressing the challenges of informal language and misspellings.
Findings
Recovered almost 15% additional data by applying the complementary methods.
Demonstrated that misspelling handling is a needed pre-processing step when dealing with social media data.
Showed machine learning complements traditional methods effectively.
Abstract
Since the classification of COVID-19 as a global pandemic, there have been many attempts to treat and contain the virus. Although there is no specific antiviral treatment recommended for COVID-19, there are several drugs that can potentially help with symptoms. In this work, we mined a large Twitter dataset of 424 million tweets of COVID-19 chatter to identify discourse around drug mentions. While this is seemingly a straightforward task, the informal nature of language use on Twitter means that we need machine learning alongside traditional automated methods, as we demonstrate. By applying these complementary methods, we are able to recover almost 15% additional data, making misspelling handling a needed pre-processing step when dealing with social media data.
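To illustrate why misspelling handling matters, here is a minimal sketch of dictionary matching with a fuzzy fallback. It is not the authors' pipeline: the abstract does not specify their implementation, so this swaps in simple string-similarity matching from Python's standard `difflib` module in place of a learned model. The lexicon, the `find_drug_mentions` function, the similarity cutoff, and the minimum token length are all hypothetical choices made for the example.

```python
import re
from difflib import get_close_matches

# Hypothetical mini-lexicon; a real study would use a curated drug dictionary.
DRUG_LEXICON = ["hydroxychloroquine", "remdesivir", "azithromycin", "ivermectin"]


def find_drug_mentions(tweet, lexicon=DRUG_LEXICON, cutoff=0.85, min_len=6):
    """Return (token, matched_drug, is_exact) tuples for drug-like tokens.

    Exact dictionary hits are reported first-class; tokens that miss the
    dictionary get a fuzzy second pass to catch likely misspellings.
    """
    mentions = []
    for token in re.findall(r"[a-z]+", tweet.lower()):
        if token in lexicon:
            mentions.append((token, token, True))
            continue
        # Assumption: skip short tokens, which tend to produce false matches.
        if len(token) < min_len:
            continue
        close = get_close_matches(token, lexicon, n=1, cutoff=cutoff)
        if close:
            mentions.append((token, close[0], False))
    return mentions


if __name__ == "__main__":
    text = "Is hydroxycloroquine or remdesivir working? my doc gave me azithromicin"
    for token, drug, exact in find_drug_mentions(text):
        print(token, "->", drug, "(exact)" if exact else "(fuzzy)")
```

In this example, exact matching alone would find only `remdesivir`; the fuzzy pass also recovers the two misspelled names. The cutoff trades recall against false positives and would need tuning on real data.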
