AttackER: Towards Enhancing Cyber-Attack Attribution with a Named Entity Recognition Dataset

Pritam Deka; Sampath Rajapaksha; Ruby Rani; Amirah Almutairi; Erisa Karafili

arXiv:2408.05149·cs.CR·August 12, 2024

TL;DR

This paper introduces a novel dataset for cyber-attack attribution using Named Entity Recognition (NER), enabling NLP techniques and Large Language Models to improve attribution accuracy in cybersecurity texts.

Contribution

The paper provides the first comprehensive dataset for cyber-attack attribution with rich annotations, facilitating NLP-based analysis and supporting the development of automated attribution methods.

Findings

01. The dataset effectively supports attack attribution tasks.

02. NLP techniques, especially Large Language Models, enhance NER performance in cybersecurity.

03. Experiments demonstrate improved attribution accuracy using the dataset.

Abstract

Cyber-attack attribution is an important process that allows experts to put in place attacker-oriented countermeasures and legal actions. The analysts mainly perform attribution manually, given the complex nature of this task. AI and, more specifically, Natural Language Processing (NLP) techniques can be leveraged to support cybersecurity analysts during the attribution process. However powerful these techniques are, they need to deal with the lack of datasets in the attack attribution domain. In this work, we will fill this gap and will provide, to the best of our knowledge, the first dataset on cyber-attack attribution. We designed our dataset with the primary goal of extracting attack attribution information from cybersecurity texts, utilizing named entity recognition (NER) methodologies from the field of NLP. Unlike other cybersecurity NER datasets, ours offers a rich set of…
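
To make the NER framing concrete, the following is a minimal sketch of how token-level labels can be turned into attribution-relevant entity spans. It assumes a BIO tagging scheme, which is common for NER datasets but is not confirmed by the text above. The sentence, the label names (`THREAT_ACTOR`, `ATTACK_PATTERN`, `TARGET`), and the function `decode_bio` are hypothetical illustrations, not the dataset's actual schema or tooling.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, label) spans.

    Assumption: tags look like "B-LABEL", "I-LABEL", or "O".
    An "I-" tag that does not continue a span of the same label is ignored.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts: close any open span first.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" tag or a stray "I-" tag: close any open span.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = []
            current_label = None

    # Close a span that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans


# Hypothetical example sentence and labels (not taken from the dataset).
TOKENS = ["Group", "X", "used", "spear", "phishing", "against", "banks", "."]
TAGS = [
    "B-THREAT_ACTOR", "I-THREAT_ACTOR", "O",
    "B-ATTACK_PATTERN", "I-ATTACK_PATTERN", "O",
    "B-TARGET", "O",
]

if __name__ == "__main__":
    for text, label in decode_bio(TOKENS, TAGS):
        print(f"{label}: {text}")
```

In a real pipeline the tags would come from a trained NER model rather than being written by hand; the decoding step stays the same regardless of which model produces them.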
