A Deep Learning Approach for Tweet Classification and Rescue Scheduling for Effective Disaster Management

Md. Yasin Kabir; Sanjay Madria

arXiv:1908.01456·cs.SI·August 6, 2019

TL;DR

This paper presents a deep learning model combining attention-based BLSTM and CNN for classifying disaster-related tweets, and an adaptive scheduling algorithm to optimize rescue operations based on tweet priorities.

Contribution

It introduces a novel deep learning framework for tweet classification and a hybrid scheduling algorithm for disaster rescue management.

Findings

01

The proposed model outperforms the LR, SVM, and CNN baselines in accuracy and F1-score on both the Hurricane Harvey and Irma data and the CrisisLex data.

02

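Priority-aware scheduling improves rescue efficiency: in the reported simulations, the Multi-tasks Hybrid algorithm achieves lower average waiting times than FCFS and Priority scheduling.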

03

The model's robustness is tested across multiple disaster datasets, including Hurricane Harvey and Irma, CrisisNLP, and CrisisLex.

Abstract

Acquiring information from different regions of a disaster-affected area in a timely fashion is a challenging and complex task. The extensive spread and reach of social media and networks allow people to share information in real time. However, processing social media data and extracting valuable information requires a series of operations: (1) processing each tweet for text classification, (2) determining the possible locations of people needing help based on tweets, and (3) calculating the priority of rescue tasks based on the classification of tweets. These are the three primary challenges in developing an effective rescue scheduling operation using social media data. In this paper, we first propose a deep learning model combining an attention-based Bi-directional Long Short-Term Memory (BLSTM) network and a Convolutional Neural Network (CNN) to classify the tweets under…

Figures (10)

Tables (12)

Table 1. Auxiliary Features

polarity, subjectivity, sentiment, wordsVsLength, exclamationMarks, questionMarks, digitVsLength, digitVsWord, punctuationVsLength, punctuationVsWords, nounsVsWwords, sadVsWords, angryVsWords, capitalsWords, capitalsVsWords, uniqueWords, repeatedWords, numberOfHashtags.
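
Several of the features in Table 1 are simple counts and ratios over the raw tweet text. The following is a minimal sketch of how a few of them might be computed. The exact definitions are not given in this excerpt, so each formula below is an assumption inferred from the feature name (for example, `digitVsLength` is taken to mean digits divided by character count), and the function name `auxiliary_features` is hypothetical.

```python
import re
import string


def auxiliary_features(tweet: str) -> dict:
    """Compute a handful of the auxiliary features named in Table 1.

    All definitions here are assumptions inferred from the feature names.
    """
    words = tweet.split()
    n_chars = max(len(tweet), 1)  # guard against empty input
    n_words = max(len(words), 1)

    digits = sum(ch.isdigit() for ch in tweet)
    punctuation = sum(ch in string.punctuation for ch in tweet)
    # Words written fully in capitals, ignoring one-letter words such as "I".
    capitals = sum(1 for w in words if len(w) > 1 and w.isupper())
    lowered = [w.lower() for w in words]

    return {
        "exclamationMarks": tweet.count("!"),
        "questionMarks": tweet.count("?"),
        "digitVsLength": digits / n_chars,
        "digitVsWord": digits / n_words,
        "punctuationVsLength": punctuation / n_chars,
        "punctuationVsWords": punctuation / n_words,
        "capitalsWords": capitals,
        "capitalsVsWords": capitals / n_words,
        "uniqueWords": len(set(lowered)),
        "repeatedWords": len(lowered) - len(set(lowered)),
        "numberOfHashtags": len(re.findall(r"#\w+", tweet)),
    }


# Hypothetical tweet used only for illustration.
features = auxiliary_features("HELP!! Water rising fast, 2 kids on roof #Harvey #rescue")
```

Features such as polarity, subjectivity, and sentiment need a sentiment lexicon or model and are left out of this sketch.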

Table 2. Hyperparameter values

Hyperparameter	Value/Description
Text embedding	Dimension: 300
BLSTM Layer	2 layers; 300 hidden units in each (Forward and Backward)
Conv1D Layer	3 layers; 300 convolution filters
Dense Layer	3 layers; First 2 layers have 150 and 75 units respectively and the last one is output (Dense)
Drop-out rate	Word Embedding: 0.3; Dense layer: 0.2 each;
Activation function	Conv1D, BLSTM, Dense: ReLU; Output Dense layer: Sigmoid;
Adam optimizer	Learning rate = 0.0001; $\beta_{1}$ = 0.9;
Epochs and batch	Epochs = 10 to 25; batch size = 128;

Table 3. Classifier evaluation (Hurricane Harvey and Irma)

Model	Precision	Recall	F1-score	Accuracy
LR	55.8	93.0	69.7	84.5
SVM	65.1	85.4	73.9	88.5
CNN	61.6	90.8	73.4	87.5
$CNN_{AAf}$	81.7	93.4	87.2	93.7
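
The F1-score column can be related to the precision and recall columns. The sketch below assumes the table uses the usual definition, the harmonic mean of precision and recall, and recomputes it from the precision and recall values of the four models in this table. The function name `f1_score` and the dictionary layout are illustrative choices, not part of the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the classifier evaluation table.
rows = {
    "LR": (55.8, 93.0, 69.7),
    "SVM": (65.1, 85.4, 73.9),
    "CNN": (61.6, 90.8, 73.4),
    "CNN_AAf": (81.7, 93.4, 87.2),
}

# Absolute gap between the recomputed and the reported F1-score per model.
gaps = {
    name: abs(f1_score(p, r) - reported)
    for name, (p, r, reported) in rows.items()
}
```

Under that assumption, each recomputed value agrees with the reported F1-score to within 0.1, which is consistent with rounding to one decimal place.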

Table 4. Evaluation metrics for individual classes (Hurricane Harvey and Irma) using the $CNN_{AAf}$ model

Class	Precision	Recall	F1-score	Accuracy
Help	87.9	97.7	91.2	94.9
Flood	78.2	94.1	85.3	91.3
Water Needed	87.5	71.4	78.7	98.0
DCEW	93.7	73.2	82.3	98.5
Weighted Avg	81.7	93.4	87.2	93.7

Table 5. Classifier evaluation AUC scores (CrisisNLP)

Disaster Name	LR	SVM	$CNN_{I}$	$CNN_{AAf}$
Nepal Earthquake	82.6	83.6	84.8	87.5
California Earthquake	75.5	74.7	78.3	83.6
Typhoon Hagupit	75.9	77.64	85.8	88.3
Cyclone PAM	90.6	90.74	92.6	92.6
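
For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that pairwise definition, scaled to 0-100 on the assumption that the table reports percentages. The labels and scores are hypothetical and are not taken from CrisisNLP.

```python
def auc_score(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly, in percent.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return 100.0 * correct / (len(positives) * len(negatives))


# Hypothetical labels (1 = positive class) and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = auc_score(labels, scores)  # 8 of the 9 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples; it is written for clarity rather than for use on large datasets.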

Table 6. Used Datasets from CrisisLex

2012 Colorado wildfires, 2012 Costa Rica earthquake, 2012 Guatemala earthquake, 2012 Italy earthquakes, 2012 Philippines floods, 2012 Typhoon Pablo, 2012 Venezuela refinery, 2013 Alberta floods, 2013 Australia bushfire, 2013 Bohol earthquake, 2013 Colorado floods, 2013 Manila floods, 2013 Queensland floods, 2013 Sardinia floods, and 2013 Typhoon Yolanda.

Table 7. Classifier evaluation (CrisisLex)

Model	Precision	Recall	F1-score	Accuracy
LR	85.8	71.1	77.8	85.8
SVM	90.9	74.7	82.1	73.2
CNN	93.4	76.3	84.2	76.4
$CNN_{AAf}$	93.6	93.7	93.4	93.6

Table 8. Average waiting time summary

Algorithms	Max avg WT (10p)	Max avg WT (20p)	Mean avg WT (10p)	Mean avg WT (20p)
FCFS	4.74	3.73	2.53	1.61
Priority	5.54	3.85	2.81	1.63
Multi-tasks Hybrid	4.47	3.02	2.24	1.31
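
To make the waiting-time metric concrete, the sketch below simulates a single non-preemptive rescue unit under two of the policies named in the table, FCFS and Priority, and reports the mean waiting time. It is an illustration under stated assumptions, not the authors' simulator: the `Task` fields mirror the arrival time, burst time, and priority score columns of the simulation data, the task values are hypothetical, and the Multi-tasks Hybrid algorithm is not reproduced because this excerpt does not describe it.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: int
    arrival: float   # time the rescue request becomes known
    burst: float     # time needed to complete the rescue
    priority: float  # assumption: a higher score means more urgent


def average_waiting_time(tasks, policy):
    """Simulate one non-preemptive rescue unit and return the mean waiting time.

    policy="fcfs" serves the earliest arrival first; policy="priority" serves
    the most urgent task among those that have already arrived.
    """
    if not tasks:
        return 0.0
    pending = sorted(tasks, key=lambda t: t.arrival)
    clock, total_wait = 0.0, 0.0
    while pending:
        ready = [t for t in pending if t.arrival <= clock]
        if not ready:
            # Idle until the next request arrives.
            clock = pending[0].arrival
            continue
        if policy == "fcfs":
            chosen = min(ready, key=lambda t: t.arrival)
        elif policy == "priority":
            chosen = max(ready, key=lambda t: t.priority)
        else:
            raise ValueError(f"unknown policy: {policy}")
        total_wait += clock - chosen.arrival
        clock += chosen.burst
        pending.remove(chosen)
    return total_wait / len(tasks)


# Hypothetical tasks; these numbers are not taken from the paper.
tasks = [
    Task(1, arrival=0, burst=4, priority=2),
    Task(2, arrival=1, burst=2, priority=5),
    Task(3, arrival=2, burst=1, priority=9),
    Task(4, arrival=3, burst=3, priority=1),
]
fcfs_wait = average_waiting_time(tasks, "fcfs")
priority_wait = average_waiting_time(tasks, "priority")
```

The ranking of the two policies depends on the workload, so the numbers from this toy input are not comparable to the values reported in the table.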

Table 9. Classified tweet labels for priority determination

id	Flood	Water Needed	DCEW	Sick or Injured

1

0

1

2

1

0

3

0

1

0

4

1

0

1

5

0

Table 10. Environmental features example

id	Current: Storm	Current: Road Damaged	Forecast: Storm	Forecast: Flood
1	0	1	1	0
2	0	0	1	0
3	0	1	0	1
4	0	1	0	0
5	1	0	0	0

Table 11. Real-world data sample for simulation

taskSeq	Arrival Time	Burst Time	Priority Score	Distance from Base

1