HumAID: Human-Annotated Disaster Incidents Data from Twitter with Deep Learning Benchmarks
Firoj Alam, Umair Qazi, Muhammad Imran, Ferda Ofli

TL;DR
This paper introduces a large-scale, human-annotated Twitter dataset related to disaster events, along with a data collection pipeline and baseline classification results using deep learning models.
Contribution
It provides a new dataset of ~77K human-labeled tweets, sampled from a pool of ~24 million tweets across 19 disaster events, together with the sampling pipeline used to build it. The dataset is sized to support data-hungry deep learning models.
Findings
Deep learning models outperform traditional methods on the dataset.
The dataset enables more accurate disaster-related tweet classification.
Baseline results establish a benchmark for future research.
Abstract
Social networks are widely used for information consumption and dissemination, especially during time-critical events such as natural disasters. Despite its significantly large volume, social media content is often too noisy for direct use in any application. Therefore, it is important to filter, categorize, and concisely summarize the available content to facilitate effective consumption and decision-making. To address such issues, automatic classification systems have been developed using supervised modeling approaches, thanks to the earlier efforts on creating labeled datasets. However, existing datasets are limited in different aspects (e.g., size, presence of duplicates) and are less suitable to support more advanced and data-hungry deep learning models. In this paper, we present a new large-scale dataset with ~77K human-labeled tweets, sampled from a pool of ~24 million tweets across 19…
Taxonomy
Topics: Public Relations and Crisis Communication · Sentiment Analysis and Opinion Mining · Disaster Management and Resilience
