DiPD: Disruptive event Prediction Dataset from Twitter
Sanskar Soni, Dev Mehta, Vinush Vishwanath, Aditi Seetha, Satyendra Singh Chouhan

TL;DR
This paper introduces DiPD, a comprehensive Twitter dataset for predicting disruptive events like riots and protests, aiming to enable early detection and mitigation through machine learning models.
Contribution
The creation of a large, labeled Twitter dataset specifically designed for predicting disruptive events, including feature extraction for improved model performance.
Findings
Dataset contains 263,561 records with event and non-event tweets.
Features include user follower count and location, aiding impact analysis.
Potential to improve early warning systems for social disruptions.
Abstract
Riots and protests that get out of control can cause havoc in a country. We have seen examples of this, such as the BLM movement, climate strikes, the CAA movement, and many more, which caused large-scale disruption. Our motive for creating this dataset was to support machine learning systems that give their users insight into trending events and alert them to events that could lead to disruption in the nation. If an event starts to go out of control, it can be handled and mitigated by monitoring it before the matter escalates. This dataset collects tweets about past or ongoing events known to have caused disruption and labels these tweets as 1. We also collect tweets that are considered non-eventful and label them as 0 so that they can also be used to train a classification system. The dataset contains 94855 records of unique events and 168706…
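The labeling scheme described in the abstract (1 for tweets about disruptive events, 0 for non-eventful tweets) can be illustrated with a minimal sketch. It is not the authors' pipeline: the `TweetRecord` class, its field names, the helper functions, and the sample tweets are all hypothetical, chosen only to mirror the follower-count and location features mentioned in the findings. The final check assumes the truncated 168706 figure is the second component of the 263,561 total.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class TweetRecord:
    # Hypothetical schema; the real dataset's column names may differ.
    text: str
    followers: int
    location: str
    label: int  # 1 = disruptive event, 0 = non-event


def label_tweets(event_rows, non_event_rows):
    """Attach label 1 to event rows and label 0 to non-event rows."""
    records = [TweetRecord(t, f, loc, 1) for t, f, loc in event_rows]
    records += [TweetRecord(t, f, loc, 0) for t, f, loc in non_event_rows]
    return records


def label_counts(records):
    """Count how many records carry each label."""
    return Counter(r.label for r in records)


# Invented toy rows, for illustration only (not taken from DiPD).
event_rows = [("Crowds are blocking the main road downtown", 1200, "Delhi")]
non_event_rows = [
    ("Lovely weather for a walk today", 85, "Pune"),
    ("Trying a new pasta recipe tonight", 40, "Mumbai"),
]

records = label_tweets(event_rows, non_event_rows)
counts = label_counts(records)

# The two figures quoted in the abstract sum to the total in the findings.
total_reported = 94855 + 168706
```

A list of records in this shape could then be passed to any binary classifier, with `label` as the target and the remaining fields as inputs.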
Taxonomy
Topics: Network Security and Intrusion Detection · Complex Network Analysis Techniques · Sentiment Analysis and Opinion Mining
