Twitter as a Lifeline: Human-annotated Twitter Corpora for NLP of Crisis-related Messages

Muhammad Imran; Prasenjit Mitra; Carlos Castillo

arXiv:1605.05894 · cs.CL · June 1, 2016 · 144 cites

TL;DR

This paper introduces human-annotated Twitter datasets from 19 crises, along with word embeddings and lexical resources, to improve NLP tasks like classification during emergencies.

Contribution

It provides the largest crisis-related Twitter corpora, new lexical resources, and trained word embeddings to advance NLP applications in disaster response.

Findings

01 Effective classifiers trained on the annotated data.

02 Largest crisis-related Twitter word embeddings created.

03 Normalized lexical resources for noisy social media language.
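
As a rough illustration of how a normalized lexical resource might be applied to noisy tweets, the sketch below maps out-of-vocabulary tokens to standard forms before any downstream classification. The lexicon entries, the tokenizer pattern, and the function name normalize_tweet are all hypothetical and chosen only for this example; they are not taken from the paper's released resources.

```python
import re

# Hypothetical mapping from out-of-vocabulary (OOV) tokens to normalized forms.
# These entries are illustrative only, not drawn from the paper's lexicon.
OOV_LEXICON = {
    "plz": "please",
    "ppl": "people",
    "govt": "government",
    "2day": "today",
}

# Assumed tokenizer: keeps hashtags/mentions as single tokens,
# and splits punctuation into separate tokens.
TOKEN_RE = re.compile(r"[#@]?\w+|[^\w\s]")


def normalize_tweet(text, lexicon=OOV_LEXICON):
    """Lowercase and tokenize a tweet, replacing known OOV tokens."""
    tokens = TOKEN_RE.findall(text.lower())
    return [lexicon.get(tok, tok) for tok in tokens]


if __name__ == "__main__":
    print(normalize_tweet("Plz help, ppl trapped 2day!"))
```

Tokens that are absent from the lexicon pass through unchanged, so the step is safe to run ahead of a classifier even when lexicon coverage is partial.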

Abstract

Microblogging platforms such as Twitter provide active communication channels during mass convergence and emergency events such as earthquakes and typhoons. During the sudden onset of a crisis situation, affected people post useful information on Twitter that can be used for situational awareness and other humanitarian disaster response efforts, if processed in a timely and effective manner. Processing social media information poses multiple challenges, such as parsing noisy, brief, and informal messages, learning information categories from the incoming stream of messages, and classifying them into different classes. One of the basic necessities of many of these tasks is the availability of data, in particular human-annotated data. In this paper, we present human-annotated Twitter corpora collected during 19 different crises that took place between 2013 and 2015. To demonstrate the utility…

Code & Models

Repositories

konstapo/2022-fake-news-mediaeval-task

Taxonomy

Topics: Public Relations and Crisis Communication · Sentiment Analysis and Opinion Mining · Disaster Management and Resilience