Hierarchical Retrieval Augmented Generation for Adversarial Technique Annotation in Cyber Threat Intelligence Text
Filippo Morbiato, Markus Keller, Priya Nair, Luca Romano

TL;DR
This paper introduces H-TechniqueRAG, a hierarchical retrieval-augmented generation framework that leverages the ATT&CK taxonomy to improve cyber threat technique annotation accuracy and efficiency.
Contribution
It proposes a novel hierarchical retrieval approach that incorporates the ATT&CK framework's taxonomy, significantly enhancing annotation performance and interpretability.
Findings
Outperforms state-of-the-art TechniqueRAG by 3.8% in F1 score.
Reduces inference latency by 62.4%.
Decreases LLM API calls by 60%.
Abstract
Mapping Cyber Threat Intelligence (CTI) text to MITRE ATT&CK technique IDs is a critical task for understanding adversary behaviors and automating threat defense. While recent Retrieval-Augmented Generation (RAG) approaches have demonstrated promising capabilities in this domain, they fundamentally rely on a flat retrieval paradigm. By treating all techniques uniformly, these methods overlook the inherent taxonomy of the ATT&CK framework, where techniques are structurally organized under high-level tactics. In this paper, we propose H-TechniqueRAG, a novel hierarchical RAG framework that injects this tactic-technique taxonomy as a strong inductive bias to achieve highly efficient and accurate annotation. Our approach introduces a two-stage hierarchical retrieval mechanism: it first identifies the macro-level tactics (the adversary's technical goals) and subsequently narrows the search…
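The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words cosine scorer, the toy taxonomy, the paraphrased descriptions, and the names `TAXONOMY`, `TACTIC_TEXT`, `top_k`, and `hierarchical_retrieve` are all assumptions made for illustration. Because the abstract is truncated, the second stage here simply assumes that the search is restricted to techniques listed under the tactics selected in the first stage.

```python
import math
from collections import Counter

# Toy tactic -> technique taxonomy (assumption: a tiny hand-written subset;
# the technique IDs are real ATT&CK IDs, the descriptions are paraphrased).
TAXONOMY = {
    "initial-access": {
        "T1566": "phishing email with malicious attachment or link",
        "T1190": "exploit public facing application vulnerability",
    },
    "credential-access": {
        "T1110": "brute force password guessing against accounts",
        "T1003": "os credential dumping from memory lsass",
    },
}

# Short tactic descriptions used for the first retrieval stage (hypothetical).
TACTIC_TEXT = {
    "initial-access": "gain initial foothold entry into network phishing exploit",
    "credential-access": "steal account credentials passwords hashes dumping",
}


def bow(text):
    """Lowercased bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, docs, k):
    """Return the k keys of `docs` whose text is most similar to `query`."""
    q = bow(query)
    ranked = sorted(docs, key=lambda name: cosine(q, bow(docs[name])), reverse=True)
    return ranked[:k]


def hierarchical_retrieve(query, k_tactics=1, k_techniques=2):
    """Stage 1: pick tactics. Stage 2: rank only techniques under those tactics."""
    tactics = top_k(query, TACTIC_TEXT, k_tactics)
    candidates = {
        tech_id: desc
        for tactic in tactics
        for tech_id, desc in TAXONOMY[tactic].items()
    }
    return tactics, top_k(query, candidates, k_techniques)


if __name__ == "__main__":
    report = "attackers sent a phishing email with a malicious attachment"
    print(hierarchical_retrieve(report))
```

In this sketch the second stage scores only the techniques under the selected tactics rather than the full technique list, which is the kind of search-space reduction a hierarchical scheme relies on. A flat retriever would instead score every technique for every query.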
