Evaluating the Performance of Twitter-based Exploit Detectors
Daniel Alves de Sousa, Elaine Ribeiro de Faria, Rodrigo Sanches Miani

TL;DR
This paper explores using Twitter data combined with security databases and machine learning to identify exploited vulnerabilities, highlighting the effectiveness of LightGBM and the significance of user and tweet metadata.
Contribution
It introduces a novel approach integrating social media and security data with machine learning for vulnerability exploitation detection, emphasizing the value of ground-truth data and metadata.
Findings
LightGBM improves classification accuracy
User and tweet statistics outperform tweet text
Ground-truth data from security companies enhances results
Abstract
Patch prioritization is a crucial aspect of information systems security, and knowledge of which vulnerabilities were exploited in the wild is a powerful tool to help systems administrators accomplish this task. The analysis of social media for this specific application can enhance the results and bring more agility by collecting data from online discussions and applying machine learning techniques to detect real-world exploits. In this paper, we use a technique that combines Twitter data with public database information to classify vulnerabilities as exploited or not-exploited. We analyze the behavior of different classifying algorithms, investigate the influence of different antivirus data as ground truth, and experiment with various time window sizes. Our findings suggest that using a Light Gradient Boosting Machine (LightGBM) can benefit the results, and for most cases, the…
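As an illustration of the kind of per-vulnerability user and tweet statistics the abstract refers to, the following is a minimal sketch, not the authors' pipeline. The tweet field names (`text`, `created_at`, `user`, `followers`, `retweets`), the function name `aggregate_tweet_features`, and the choice to start the time window at the vulnerability's disclosure date are all assumptions made for this example.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Matches CVE identifiers such as CVE-2021-0001 (case-insensitive).
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def aggregate_tweet_features(tweets, disclosure_dates, window_days):
    """Aggregate simple tweet and user statistics per CVE.

    tweets: iterable of dicts with hypothetical keys
        text, created_at (datetime), user, followers, retweets.
    disclosure_dates: dict mapping CVE id -> disclosure datetime
        (assumed to come from a public vulnerability database).
    window_days: only tweets posted within this many days after
        disclosure are counted (an assumed window definition).
    """
    window = timedelta(days=window_days)
    stats = defaultdict(
        lambda: {"tweet_count": 0, "retweet_total": 0,
                 "users": set(), "follower_total": 0}
    )

    for tweet in tweets:
        # A tweet may mention several CVEs; count it once per CVE.
        mentioned = {m.upper() for m in CVE_PATTERN.findall(tweet["text"])}
        for cve_id in mentioned:
            disclosed = disclosure_dates.get(cve_id)
            if disclosed is None:
                continue  # CVE not in the reference database
            if not (disclosed <= tweet["created_at"] <= disclosed + window):
                continue  # outside the time window
            s = stats[cve_id]
            s["tweet_count"] += 1
            s["retweet_total"] += tweet["retweets"]
            if tweet["user"] not in s["users"]:
                # Count each user's followers only once per CVE.
                s["users"].add(tweet["user"])
                s["follower_total"] += tweet["followers"]

    # Convert the running statistics into flat feature rows.
    return {
        cve_id: {
            "tweet_count": s["tweet_count"],
            "retweet_total": s["retweet_total"],
            "unique_users": len(s["users"]),
            "follower_total": s["follower_total"],
        }
        for cve_id, s in stats.items()
    }
```

Rows like these, labeled as exploited or not-exploited from a ground-truth source, could then be passed to a classifier such as LightGBM. Rerunning the aggregation with different `window_days` values is one way to vary the time window size.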
Taxonomy
Topics: Network Security and Intrusion Detection · Spam and Phishing Detection · Advanced Malware Detection Techniques
