Semi-Supervised Anomaly Detection for the Determination of Vehicle Hijacking Tweets

Taahir Aiyoob Patel; Clement N. Nyirenda

arXiv:2308.10036·cs.LG·August 22, 2023

TL;DR

This paper introduces a semi-supervised approach that applies anomaly detection algorithms to tweets in order to identify vehicle hijacking incidents. CBLOF (90% accuracy, F1-score of 0.8) slightly outperforms KNN (89% accuracy, F1-score of 0.78).

Contribution

The work presents a novel semi-supervised method combining TF-IDF with anomaly detection algorithms for hijacking tweet detection, demonstrating effectiveness over traditional approaches.

Findings

01. CBLOF achieved 90% accuracy and an F1-score of 0.8

02. KNN achieved 89% accuracy and an F1-score of 0.78

03. CBLOF was identified as the preferred method
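To make the two reported metrics concrete, the sketch below computes accuracy and F1-score from a confusion matrix. The counts are invented purely for the arithmetic and are not taken from the paper, and the function names are hypothetical. The counts are chosen so that accuracy is 0.9 while F1 is 0.8, showing how the two figures can differ when the flagged class is the minority.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all tweets that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts for illustration only (NOT the paper's results):
# 25 tweets truly belong to the flagged class, 75 do not.
tp, fp, fn, tn = 20, 5, 5, 70

acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
print(f"accuracy={acc:.2f} f1={f1:.2f}")
```

Accuracy counts the 70 correctly ignored tweets, while F1 ignores true negatives entirely, so F1 is the more demanding figure whenever the flagged class is rare.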

Abstract

In South Africa, vehicle hijackings are an ever-growing problem, leaving travellers in constant fear of becoming a victim of such an incident. This work presents a new semi-supervised approach that uses tweets to identify hijacking incidents by means of unsupervised anomaly detection algorithms. Tweets containing the keyword "hijacking" are obtained, stored, and processed using term frequency-inverse document frequency (TF-IDF), and then analyzed with two anomaly detection algorithms: 1) K-Nearest Neighbour (KNN); 2) Cluster-Based Local Outlier Factor (CBLOF). The comparative evaluation showed that the KNN method produced an accuracy of 89%, whereas CBLOF produced an accuracy of 90%. CBLOF also obtained an F1-score of 0.8, whereas KNN obtained 0.78. Therefore, there is a slight difference between the two approaches, in favour of…
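The pipeline described in the abstract (TF-IDF features followed by a distance-based anomaly score) can be sketched roughly as follows. This is a minimal standard-library illustration, not the authors' implementation. The toy tweets, the smoothed IDF formula, the choice of k, and all function names are assumptions. It also assumes the detector is fitted only on tweets taken to describe incidents, so that tweets far from that set receive a high score. Only the KNN variant is shown; CBLOF is not sketched here.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency learned from the training tweets."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(doc, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen terms are dropped."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def euclidean(a, b):
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def knn_score(vec, train_vecs, k=2):
    """Anomaly score: distance to the k-th nearest training vector."""
    dists = sorted(euclidean(vec, t) for t in train_vecs)
    return dists[k - 1]


# Toy training set (invented for illustration), assumed to describe incidents.
train_tweets = [
    "hijacking reported on the n1 highway near cape town",
    "vehicle hijacking reported near the highway in durban",
    "armed hijacking of a vehicle reported in cape town",
    "police confirm vehicle hijacking near durban highway",
]
idf = fit_idf(train_tweets)
train_vecs = [tfidf_vector(t, idf) for t in train_tweets]

incident_score = knn_score(
    tfidf_vector("vehicle hijacking reported near cape town", idf), train_vecs
)
offtopic_score = knn_score(
    tfidf_vector("this movie plot is hijacking my whole weekend", idf), train_vecs
)
print(incident_score, offtopic_score)
```

In this toy setup the off-topic tweet shares only the keyword "hijacking" with the training set, so it sits further from its neighbours and scores higher. Turning scores into labels would additionally require a threshold, which is left out because the source does not specify one.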

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Data-Driven Disease Surveillance