Tracking Dengue Epidemics using Twitter Content Classification and Topic Modelling
Paolo Missier, Alexander Romanovsky, Tudor Miu, Atinder Pal, Michael Daniilakis, Alessandro Garcia, Diego Cedrim, Leonardo da Silva Sousa

TL;DR
This paper compares supervised classification and unsupervised topic modelling approaches for detecting Dengue-related content on Twitter, highlighting their respective strengths and limitations in epidemic monitoring.
Contribution
It provides a comparative analysis of classification and topic modelling methods for social media health surveillance during Dengue outbreaks.
Findings
The classifier achieves about 80% prediction accuracy from a small training set of about 1,000 instances
Topic modelling scales well with larger datasets
Each method has distinct advantages and drawbacks
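The supervised side of this comparison can be illustrated with a minimal sketch. It is not the authors' classifier: the summary does not say which algorithm or features they used, so the code below assumes a bag-of-words multinomial Naive Bayes model with Laplace smoothing, and the English training messages, the label names `relevant` / `irrelevant`, and the helper names `tokenize`, `train`, and `predict` are all hypothetical, invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothed log likelihood
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set; real annotation would be done manually
# on actual tweets, which is the cost the abstract refers to.
TRAIN = [
    ("i have dengue fever and a headache", "relevant"),
    ("my neighbour was hospitalised with dengue", "relevant"),
    ("mosquito breeding site found near my street", "relevant"),
    ("this traffic is giving me a headache", "irrelevant"),
    ("great football match last night", "irrelevant"),
    ("new song stuck in my head", "irrelevant"),
]

model = train(TRAIN)
print(predict(model, "dengue mosquito near my house"))
print(predict(model, "football match tonight"))
```

The sketch makes the trade-off in the findings concrete: every training example needs a human-assigned label, which is why a supervised approach is accurate on small data but costly to keep current as the vocabulary of an outbreak shifts.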
Abstract
Detecting and preventing outbreaks of mosquito-borne diseases such as Dengue and Zika in Brazil and other tropical regions has long been a priority for governments in affected areas. Streaming social media content, such as Twitter, is increasingly being used for health vigilance applications such as flu detection. However, previous work has not addressed the complexity of drastic seasonal changes in Twitter content across multiple epidemic outbreaks. To address this gap, this paper contrasts two complementary approaches to detecting Twitter content that is relevant for Dengue outbreak detection, namely supervised classification and unsupervised clustering using topic modelling. Each approach has benefits and shortcomings. Our classifier achieves a prediction accuracy of about 80% based on a small training set of about 1,000 instances, but the need for manual annotation makes…
Taxonomy
Topics: Data-Driven Disease Surveillance · Complex Network Analysis Techniques · Human Mobility and Location-Based Analysis
