Multi-features based Semantic Augmentation Networks for Named Entity Recognition in Threat Intelligence
Peipei Liu, Hong Li, Zuoguang Wang, Jie Liu, Yimo Ren, Hongsong Zhu

TL;DR
This paper introduces a semantic augmentation network that combines linguistic features and domain-specific similarities to improve named entity recognition in cybersecurity texts, addressing data sparsity and variability issues.
Contribution
It proposes a novel multi-feature semantic augmentation approach that enhances token representations using linguistic features and domain-specific semantic similarities for better NER performance.
Findings
Improved NER accuracy on cybersecurity datasets DNRTI and MalwareTextDB.
Effective integration of linguistic features and domain-specific semantic information.
Demonstrated robustness in extracting security entities from unstructured texts.
Abstract
Extracting cybersecurity entities such as attackers and vulnerabilities from unstructured network texts is an important part of security analysis. However, the sparsity of intelligence data, which results from the high frequency of variation and the randomness of cybersecurity entity names, makes it difficult for current methods to perform well in extracting security-related concepts and entities. To this end, we propose a semantic augmentation method that incorporates different linguistic features to enrich the representation of input tokens, in order to detect and classify cybersecurity names in unstructured text. In particular, we encode and aggregate the constituent feature, morphological feature, and part-of-speech feature for each input token to improve the robustness of the method. In addition, a token gets augmented semantic information from its K most similar words in cybersecurity…
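The two ideas named in the abstract, aggregating per-token linguistic features and drawing extra semantic information from a token's K most similar words, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated before it specifies the similarity source or the aggregation, so the cosine similarity, the similarity-weighted average, the plain concatenation, the toy embedding table `EMB`, and the function names below are all assumptions made for illustration.

```python
import math

# Toy embedding table (hypothetical values, standing in for domain embeddings).
EMB = {
    "trojan":   [1.0, 0.0],
    "malware":  [0.9, 0.1],
    "backdoor": [0.8, 0.3],
    "london":   [0.0, 1.0],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_similar(token, embeddings, k):
    """Return the k words most similar to `token` as (word, score) pairs."""
    query = embeddings[token]
    scored = [(cosine(query, vec), word)
              for word, vec in embeddings.items() if word != token]
    scored.sort(reverse=True)
    return [(word, score) for score, word in scored[:k]]

def augment(token, embeddings, k):
    """Similarity-weighted average of the k nearest neighbours (assumed scheme)."""
    neighbours = top_k_similar(token, embeddings, k)
    dim = len(embeddings[token])
    total = sum(max(score, 0.0) for _, score in neighbours)
    augmented = [0.0] * dim
    if total == 0.0:
        return augmented
    for word, score in neighbours:
        weight = max(score, 0.0) / total
        for i, value in enumerate(embeddings[word]):
            augmented[i] += weight * value
    return augmented

def token_representation(token, embeddings, feature_vectors, k=2):
    """Concatenate the token embedding, its linguistic feature vectors
    (constituent, morphological, part of speech; encoders not shown here),
    and the top-k augmentation vector."""
    features = [x for vec in feature_vectors for x in vec]
    return list(embeddings[token]) + features + augment(token, embeddings, k)

# Placeholder feature encodings for one token (hypothetical values).
rep = token_representation("trojan", EMB, [[1.0, 0.0], [0.0], [1.0]], k=2)
```

In this toy setup the nearest neighbours of "trojan" are "malware" and "backdoor", so the augmentation vector is pulled toward security-related words even if "trojan" itself were rare in the training data, which is the kind of sparsity the abstract describes.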
Taxonomy
Topics: Network Security and Intrusion Detection · Cybercrime and Law Enforcement Studies · Spam and Phishing Detection
