AttackER: Towards Enhancing Cyber-Attack Attribution with a Named Entity Recognition Dataset
Pritam Deka, Sampath Rajapaksha, Ruby Rani, Amirah Almutairi, Erisa Karafili

TL;DR
This paper introduces a novel dataset for cyber-attack attribution using Named Entity Recognition (NER), enabling NLP techniques and Large Language Models to improve attribution accuracy in cybersecurity texts.
Contribution
The paper provides the first comprehensive dataset for cyber-attack attribution with rich annotations, facilitating NLP-based analysis and supporting the development of automated attribution methods.
Findings
The dataset effectively supports attack attribution tasks.
NLP techniques, especially Large Language Models, enhance NER performance in cybersecurity.
Experiments demonstrate improved attribution accuracy using the dataset.
Abstract
Cyber-attack attribution is an important process that allows experts to put in place attacker-oriented countermeasures and legal actions. The analysts mainly perform attribution manually, given the complex nature of this task. AI and, more specifically, Natural Language Processing (NLP) techniques can be leveraged to support cybersecurity analysts during the attribution process. However powerful these techniques are, they need to deal with the lack of datasets in the attack attribution domain. In this work, we will fill this gap and will provide, to the best of our knowledge, the first dataset on cyber-attack attribution. We designed our dataset with the primary goal of extracting attack attribution information from cybersecurity texts, utilizing named entity recognition (NER) methodologies from the field of NLP. Unlike other cybersecurity NER datasets, ours offers a rich set of…
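As a rough illustration of what extracting attribution information with NER involves, the sketch below decodes token-level BIO tags into entity spans. It is a minimal example under stated assumptions, not the authors' pipeline: the labels `THREAT_ACTOR` and `MALWARE`, the sample sentence, and the helper `bio_to_spans` are all hypothetical and are not taken from the dataset's actual annotation scheme.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) entity spans.

    Hypothetical helper for illustration only. An I- tag that does not
    continue the currently open entity is treated like an O tag.
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins: close any entity that is still open.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the open entity.
            current_tokens.append(token)
        else:
            # O tag (or a stray I- tag): close any open entity.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = None
            current_tokens = []

    # Flush an entity that runs to the end of the sentence.
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Invented sentence and assumed label names, not drawn from the dataset.
TOKENS = ["Example", "Group", "deployed", "FakeLoader", "malware"]
TAGS = ["B-THREAT_ACTOR", "I-THREAT_ACTOR", "O", "B-MALWARE", "O"]

if __name__ == "__main__":
    for label, text in bio_to_spans(TOKENS, TAGS):
        print(f"{label}: {text}")
```

In practice the tags would come from a trained NER model rather than being written by hand; the decoding step stays the same regardless of which model produces them.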
Taxonomy
Methods: Sparse Evolutionary Training
