Cross-Language Domain Adaptation for Classifying Crisis-Related Short Messages
Muhammad Imran, Prasenjit Mitra, Jaideep Srivastava

TL;DR
This paper investigates how labeled crisis-related tweets from past events, in the same or a different language, can be reused to improve real-time classification of messages during disasters, and highlights the benefits and limitations of cross-event and cross-language adaptation.
Contribution
It provides an extensive analysis of the effectiveness of using past disaster data and cross-language data for classifying crisis messages, revealing when such transfer learning is beneficial.
Findings
Past labels are useful when the source and target events are of the same type (e.g., both earthquakes).
Cross-language adaptation helps when the source and target languages are similar.
Performance decreases when cross-language models are applied across dissimilar languages.
Abstract
Rapid crisis response requires real-time analysis of messages. After a disaster happens, volunteers attempt to classify tweets to determine needs, e.g., supplies or infrastructure damage. Given labeled data, supervised machine learning can help classify these messages. Scarcity of labeled data causes poor performance in machine training. Can we reuse old tweets to train classifiers? How can we choose labeled tweets for training? Specifically, we study the usefulness of labeled data of past events. Do labeled tweets in a different language help? We observe the performance of our classifiers trained using different combinations of training sets obtained from past disasters. We perform extensive experimentation on real crisis datasets and show that the past labels are useful when both source and target events are of the same type (e.g., both earthquakes). For similar languages (e.g.,…
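The cross-event setup the abstract describes (train on labeled messages from a past event, evaluate on a new one) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the bag-of-words Naive Bayes classifier is swapped in for simplicity, the function names (`train_nb`, `predict`, `accuracy`) are hypothetical, and the messages and the labels "needs" and "damage" are toy data invented for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercased whitespace tokens; real tweets would need more cleaning.
    return text.lower().split()


def train_nb(examples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the most probable label, using add-one smoothing."""
    total = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, count in model["labels"].items():
        score = math.log(count / total)
        denom = sum(model["words"][label].values()) + vocab_size
        for tok in tokenize(text):
            if tok in model["vocab"]:  # tokens unseen in training are skipped
                score += math.log((model["words"][label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)


# Toy, hypothetical data: a past (source) event and a new (target) event
# of the same type and in the same language.
source_event = [
    ("need water and food supplies", "needs"),
    ("families need blankets and medicine", "needs"),
    ("bridge collapsed after the earthquake", "damage"),
    ("buildings destroyed and roads damaged", "damage"),
]
target_event = [
    ("people need water and medicine", "needs"),
    ("roads damaged and bridge destroyed", "damage"),
]

model = train_nb(source_event)  # train only on the past event
target_accuracy = accuracy(model, target_event)  # evaluate on the new event
print(target_accuracy)
```

Swapping `source_event` for a different past event, or for a combination of several, mirrors the comparison of training-set combinations mentioned in the abstract. The toy data says nothing about the accuracy figures reported in the paper.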
Taxonomy
Topics: Public Relations and Crisis Communication · Seismology and Earthquake Studies · Sentiment Analysis and Opinion Mining
