Multi-features based Semantic Augmentation Networks for Named Entity Recognition in Threat Intelligence

Peipei Liu; Hong Li; Zuoguang Wang; Jie Liu; Yimo Ren; Hongsong Zhu

arXiv:2207.00232 · cs.CR · July 4, 2022

TL;DR

This paper introduces a semantic augmentation network that combines linguistic features and domain-specific similarities to improve named entity recognition in cybersecurity texts, addressing data sparsity and variability issues.

Contribution

It proposes a novel multi-feature semantic augmentation approach that enhances token representations using linguistic features and domain-specific semantic similarities for better NER performance.

Findings

1. Improved NER accuracy on the cybersecurity datasets DNRTI and MalwareTextDB.
2. Effective integration of linguistic features and domain-specific semantic information.
3. Demonstrated robustness in extracting security entities from unstructured texts.

Abstract

Extracting cybersecurity entities such as attackers and vulnerabilities from unstructured network texts is an important part of security analysis. However, the sparsity of intelligence data resulting from the higher-frequency variations and the randomness of cybersecurity entity names makes it difficult for current methods to perform well in extracting security-related concepts and entities. To this end, we propose a semantic augmentation method that incorporates different linguistic features to enrich the representation of input tokens in order to detect and classify cybersecurity names in unstructured text. In particular, we encode and aggregate the constituent feature, morphological feature, and part-of-speech feature for each input token to improve the robustness of the method. Moreover, a token gets augmented semantic information from its K most similar words in cybersecurity…
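
The idea of augmenting a token with information from its K most similar words can be pictured with a minimal sketch. This is not the authors' implementation: the abstract is truncated here and does not say how similarity is measured or how the neighbours are combined, so cosine similarity, similarity-weighted averaging, and concatenation are all assumptions, and the embedding table `toy_embeddings` and the helper names are hypothetical.

```python
import math

# Hypothetical toy embedding table; real vectors would come from a
# model trained on cybersecurity text (an assumption, not from the paper).
toy_embeddings = {
    "trojan":   [0.9, 0.1, 0.0],
    "malware":  [0.8, 0.2, 0.1],
    "backdoor": [0.7, 0.3, 0.0],
    "report":   [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k_similar(token, embeddings, k):
    """Return the k words most similar to `token` as (word, score) pairs."""
    query = embeddings[token]
    scored = [
        (word, cosine(query, vec))
        for word, vec in embeddings.items()
        if word != token
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def augment(token, embeddings, k=2):
    """Concatenate the token vector with a similarity-weighted average
    of its k nearest neighbours (assumed aggregation scheme)."""
    neighbours = top_k_similar(token, embeddings, k)
    dim = len(embeddings[token])
    total = sum(max(score, 0.0) for _, score in neighbours)
    if total == 0:
        context = [0.0] * dim
    else:
        context = [
            sum(max(score, 0.0) / total * embeddings[word][i]
                for word, score in neighbours)
            for i in range(dim)
        ]
    return list(embeddings[token]) + context


print(top_k_similar("trojan", toy_embeddings, k=2))
print(augment("trojan", toy_embeddings, k=2))
```

In this toy setup a rare entity name borrows context from more frequent, semantically close words, which is one plausible way to counter the sparsity the abstract describes. The linguistic features (constituent, morphological, part of speech) would be encoded separately and are not covered by this sketch.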

Code & Models

Repositories

liupeip-cs/ner4cti (PyTorch, official)

Taxonomy

Topics: Network Security and Intrusion Detection · Cybercrime and Law Enforcement Studies · Spam and Phishing Detection