# Processing Tweets for Cybersecurity Threat Awareness

**Authors:** Fernando Alves, Aurélien Bettini, Pedro M. Ferreira, Alysson Bessani

arXiv: 1904.02072 · 2019-04-04

## TL;DR

This paper introduces SYNAPSE, a Twitter-based streaming system that efficiently filters, classifies, clusters, and summarizes cybersecurity threat information from tweets to enhance real-time threat awareness for IT infrastructures.

## Contribution

The paper presents a Twitter-based threat monitoring pipeline with a new clustering strategy. It evaluates the pipeline quantitatively (over 195,000 tweets from 80 accounts, collected over more than 8 months) and qualitatively (relevance and timeliness of the generated IoCs).

## Key findings

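- True positive rate above 90% for security-related tweets concerning the example IT infrastructure
- False positive rate below 10% (tweets incorrectly selected as relevant)
- Results are summarised to very few IoCs per day; the IoCs are relevant (judged by CVSS score and the availability of patches or exploits) and timely (judged against NVD disclosure dates)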
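As a reminder of how the two rates above are defined, the following minimal sketch computes them from confusion-matrix counts. The counts in the usage line are hypothetical placeholders chosen for illustration; they are not results from the paper.

```python
def rates(tp: int, fp: int, tn: int, fn: int) -> tuple:
    """Return (true positive rate, false positive rate).

    TPR = TP / (TP + FN): share of relevant tweets that were selected.
    FPR = FP / (FP + TN): share of irrelevant tweets that were selected.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical counts, for illustration only (not taken from the paper).
tpr, fpr = rates(tp=92, fp=45, tn=855, fn=8)
print(f"TPR={tpr:.2f} FPR={fpr:.2f}")
```

Both rates are needed together: a classifier that selects every tweet reaches a 100% true positive rate, but its false positive rate is also 100%.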

## Abstract

Receiving timely and relevant security information is crucial for maintaining a high security level on an IT infrastructure. This information can be extracted from Open Source Intelligence published daily by users, security organisations, and researchers. In particular, Twitter has become an information hub for obtaining cutting-edge information about many subjects, including cybersecurity. This work proposes SYNAPSE, a Twitter-based streaming threat monitor that generates a continuously updated summary of the threat landscape related to a monitored infrastructure. Its tweet-processing pipeline is composed of filtering, feature extraction, binary classification, an innovative clustering strategy, and generation of Indicators of Compromise (IoCs). A quantitative evaluation covering all tweets from 80 accounts over more than 8 months (over 195,000 tweets) shows that our approach finds the majority of security-related tweets concerning an example IT infrastructure in a timely manner (true positive rate above 90%), incorrectly selects only a small number of tweets as relevant (false positive rate under 10%), and summarises the results to very few IoCs per day. A qualitative evaluation of the IoCs generated by SYNAPSE demonstrates their relevance (based on the CVSS score and the availability of patches or exploits) and timeliness (based on threat disclosure dates from NVD).
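The five pipeline stages named in the abstract can be sketched roughly as follows. This is an illustrative toy, not SYNAPSE's implementation: the keyword sets, the rule-based stand-in for the binary classifier, the greedy Jaccard clustering, the `threshold` value, and the IoC dictionary layout are all assumptions made for this example, and the sample tweets (including the placeholder identifier `CVE-0000-0000`) are invented.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keywords describing the monitored infrastructure.
INFRA_KEYWORDS = {"apache", "openssl", "linux"}
# Hypothetical security vocabulary used by the stand-in classifier.
SECURITY_TERMS = {"vulnerability", "exploit", "cve", "patch", "rce"}


def extract_features(text: str) -> set:
    """Feature extraction stand-in: the set of lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def passes_filter(tokens: set) -> bool:
    """Filtering: keep tweets that mention the monitored infrastructure."""
    return bool(tokens & INFRA_KEYWORDS)


def is_relevant(tokens: set) -> bool:
    """Binary classification stand-in: a keyword rule, not a trained model."""
    return bool(tokens & SECURITY_TERMS)


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class Cluster:
    tokens: set                       # tokens of the first tweet in the cluster
    tweets: list = field(default_factory=list)


def process(tweets, threshold: float = 0.5) -> list:
    """Run all stages and return one IoC-like summary per cluster."""
    clusters = []
    for text in tweets:
        tokens = extract_features(text)
        if not passes_filter(tokens) or not is_relevant(tokens):
            continue
        # Clustering stand-in: join the first sufficiently similar cluster.
        for cluster in clusters:
            if jaccard(tokens, cluster.tokens) >= threshold:
                cluster.tweets.append(text)
                break
        else:
            clusters.append(Cluster(tokens=tokens, tweets=[text]))
    # IoC generation stand-in: first tweet as summary, plus cluster size.
    return [{"summary": c.tweets[0], "tweet_count": len(c.tweets)}
            for c in clusters]


# Invented sample stream, for illustration only.
SAMPLE_TWEETS = [
    "Critical OpenSSL vulnerability CVE-0000-0000 patch now",
    "OpenSSL vulnerability CVE-0000-0000 patch released",
    "New Apache exploit in the wild",
    "Great Linux conference this week",   # infrastructure match, not security
    "Windows exploit published",          # security, but not monitored
]

for ioc in process(SAMPLE_TWEETS):
    print(ioc)
```

In this toy, the two OpenSSL tweets collapse into one summary, which illustrates how clustering reduces many relevant tweets to a small number of IoCs.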

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.02072/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1904.02072/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/1904.02072/full.md

---
Source: https://tomesphere.com/paper/1904.02072