Hierarchical Retrieval Augmented Generation for Adversarial Technique Annotation in Cyber Threat Intelligence Text

Filippo Morbiato; Markus Keller; Priya Nair; Luca Romano

arXiv:2604.14166 · cs.CL · April 17, 2026

TL;DR

This paper introduces H-TechniqueRAG, a hierarchical retrieval-augmented generation framework that leverages the ATT&CK taxonomy to improve cyber threat technique annotation accuracy and efficiency.

Contribution

The paper proposes a hierarchical retrieval approach built on the ATT&CK tactic-technique taxonomy, which improves both annotation performance and interpretability.

Findings

01. Outperforms state-of-the-art TechniqueRAG by 3.8% in F1 score.
02. Reduces inference latency by 62.4%.
03. Decreases LLM API calls by 60%.

Abstract

Mapping Cyber Threat Intelligence (CTI) text to MITRE ATT&CK technique IDs is a critical task for understanding adversary behaviors and automating threat defense. While recent Retrieval-Augmented Generation (RAG) approaches have demonstrated promising capabilities in this domain, they fundamentally rely on a flat retrieval paradigm. By treating all techniques uniformly, these methods overlook the inherent taxonomy of the ATT&CK framework, where techniques are structurally organized under high-level tactics. In this paper, we propose H-TechniqueRAG, a novel hierarchical RAG framework that injects this tactic-technique taxonomy as a strong inductive bias to achieve highly efficient and accurate annotation. Our approach introduces a two-stage hierarchical retrieval mechanism: it first identifies the macro-level tactics (the adversary's technical goals) and subsequently narrows the search…
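
The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated before it describes the second stage, so the sketch assumes that stage two ranks only the techniques listed under the tactics chosen in stage one. The similarity function (bag-of-words cosine), the stopword list, the short descriptions, and the names `hierarchical_retrieve`, `TACTICS`, and `TECHNIQUES` are all hypothetical stand-ins; only the ATT&CK IDs and their names come from the public ATT&CK Enterprise matrix.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, only to keep the toy similarity meaningful.
STOPWORDS = {"a", "an", "the", "to", "or", "with", "of", "and", "through", "in"}

# Toy slice of ATT&CK: tactic ID -> (paraphrased description, technique IDs).
TACTICS = {
    "TA0001": ("Initial Access: gain entry to the network through spearphishing "
               "email or exploiting a public facing server", ["T1566", "T1190"]),
    "TA0002": ("Execution: run malicious code through a command interpreter "
               "or script", ["T1059", "T1053"]),
    "TA0003": ("Persistence: maintain foothold across reboots with registry run "
               "keys or scheduled tasks", ["T1547", "T1053"]),
}

# Technique ID -> paraphrased description (assumed wording, not official text).
TECHNIQUES = {
    "T1566": "Phishing: spearphishing email with a malicious attachment or link",
    "T1190": "Exploit Public-Facing Application: exploit a vulnerability in an "
             "internet facing server",
    "T1059": "Command and Scripting Interpreter: powershell, cmd, bash or python",
    "T1053": "Scheduled Task/Job: schedule a task or cron job to run code",
    "T1547": "Boot or Logon Autostart Execution: registry run keys, startup folder",
}


def embed(text):
    """Bag-of-words vector; a stand-in for a learned dense encoder."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[token] for token, count in u.items())
    norm = math.sqrt(sum(c * c for c in u.values())) * \
        math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def hierarchical_retrieve(text, top_tactics=1, top_techniques=2):
    """Stage 1: rank tactics. Stage 2: rank only techniques under those tactics."""
    query = embed(text)
    tactic_ids = sorted(
        TACTICS, key=lambda t: cosine(query, embed(TACTICS[t][0])), reverse=True
    )[:top_tactics]

    # Narrowed candidate pool; dict.fromkeys de-duplicates while keeping order.
    candidates = list(dict.fromkeys(
        tech for t in tactic_ids for tech in TACTICS[t][1]
    ))
    technique_ids = sorted(
        candidates, key=lambda t: cosine(query, embed(TECHNIQUES[t])), reverse=True
    )[:top_techniques]
    return tactic_ids, technique_ids


if __name__ == "__main__":
    report = "The actor sent a spearphishing email with a malicious attachment"
    print(hierarchical_retrieve(report))
```

In this toy setup the second stage scores two candidate techniques instead of all five, which is the kind of search-space reduction a hierarchical scheme aims for. A flat retriever would score every technique for every query.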
