Automatic Identification of Indicators of Compromise using Neural-Based Sequence Labelling
Shengping Zhou, Zi Long, Lianzhi Tan, Hao Guo

TL;DR
This paper introduces a neural sequence labelling approach with attention mechanisms to automatically identify Indicators of Compromise in cybersecurity reports, eliminating the need for expert-crafted features.
Contribution
It is the first work to apply end-to-end neural sequence labelling with attention to IOC detection, improving accuracy over existing models without requiring cybersecurity expertise.
Findings
Achieves an average F1-score of over 88% in IOC identification
Outperforms other sequence labelling models
Effectively detects low-frequency IOCs in long sentences
Abstract
Indicators of Compromise (IOCs) are artifacts observed on a network or in an operating system that can be used to indicate a computer intrusion and to detect cyber-attacks at an early stage. They therefore play an important role in the field of cybersecurity. However, state-of-the-art IOC detection systems rely heavily on hand-crafted features that require expert knowledge of cybersecurity, and they need large supervised training corpora to train an IOC classifier. In this paper, we propose using a neural-based sequence labelling model to identify IOCs automatically from cybersecurity reports without expert knowledge of cybersecurity. Our work is the first to apply end-to-end sequence labelling to the task of IOC identification. By using an attention mechanism and several token spelling features, we find that the proposed model is capable of identifying the low-frequency IOCs…
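To make the sequence labelling framing concrete, the sketch below shows how a tokenized report sentence might be paired with per-token tags and simple spelling features. It is illustrative only: the BIO tagging scheme, the label names (`DOMAIN`, `FILE`), the example sentence, and the specific features are assumptions, not the feature set or label inventory used in the paper.

```python
import re

def spelling_features(token: str) -> dict:
    """Hypothetical token spelling features; the paper's actual set is not given here."""
    return {
        "has_digit": any(c.isdigit() for c in token),
        "has_upper": any(c.isupper() for c in token),
        "has_dot": "." in token,
        "has_slash": "/" in token or "\\" in token,
        # Long all-hex strings often look like file hashes (MD5, SHA-1, SHA-256).
        "is_long_hex": re.fullmatch(r"[0-9a-fA-F]{32,}", token) is not None,
    }

def bio_tags(tokens, spans):
    """Convert (start, end_exclusive, label) spans into one BIO tag per token."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags

# Made-up sentence and annotations, for illustration only.
tokens = ["The", "dropper", "contacts", "evil.example.com",
          "and", "writes", "C:\\Temp\\a.exe"]
spans = [(3, 4, "DOMAIN"), (6, 7, "FILE")]
tags = bio_tags(tokens, spans)

for tok, tag in zip(tokens, tags):
    print(tok, tag, spelling_features(tok))
```

A sequence labelling model would be trained to predict the tag column from the token column, with features like these supplied alongside the learned token representations.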
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Spam and Phishing Detection
