TechniqueRAG: Retrieval Augmented Generation for Adversarial Technique Annotation in Cyber Threat Intelligence Text

Ahmed Lekssays; Utsav Shukla; Husrev Taha Sencar; Md Rizwan Parvez

arXiv:2505.11988·cs.CR·August 12, 2025

TL;DR

TechniqueRAG is a retrieval-augmented generation framework tailored for cyber threat intelligence that improves adversarial technique annotation by combining off-the-shelf retrievers, instruction-tuned LLMs, and minimal domain-specific data, achieving state-of-the-art results.

Contribution

The paper introduces a domain-specific RAG approach that improves retrieval quality through zero-shot LLM re-ranking and fine-tunes only the generation component, which reduces resource requirements.
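The retrieve-then-re-rank idea described here can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy corpus (ATT&CK-style technique IDs with paraphrased descriptions), the token-overlap `retrieve` function standing in for an off-the-shelf retriever, and `stub_llm_score`, which is only a placeholder for a zero-shot LLM relevance judgment. None of it is taken from the paper's code.

```python
import re
from typing import Callable, List, Set, Tuple

Technique = Tuple[str, str]  # (technique_id, short description)

# Hypothetical toy corpus; descriptions are shortened paraphrases.
CORPUS: List[Technique] = [
    ("T1566", "phishing email with malicious attachment or link sent to victim"),
    ("T1059", "command and scripting interpreter used to execute commands or scripts"),
    ("T1003", "credential dumping to obtain account passwords and hashes from memory"),
    ("T1486", "data encrypted for impact ransomware encrypts files on victim systems"),
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: List[Technique], k: int = 3) -> List[Technique]:
    """Stage 1: stand-in for an off-the-shelf retriever (token overlap count)."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda t: len(q & tokenize(t[1])), reverse=True)
    return ranked[:k]


def stub_llm_score(query: str, technique: Technique) -> float:
    """Placeholder for a zero-shot LLM relevance score.

    A real system would prompt an instruction-tuned LLM here; this stub
    returns the fraction of description tokens that appear in the query.
    """
    q = tokenize(query)
    d = tokenize(technique[1])
    return len(q & d) / max(len(d), 1)


def rerank(
    query: str,
    candidates: List[Technique],
    score_fn: Callable[[str, Technique], float],
    top_n: int = 2,
) -> List[Technique]:
    """Stage 2: re-order retrieved candidates with a scorer that needs no training."""
    ranked = sorted(candidates, key=lambda t: score_fn(query, t), reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    report = "The attackers sent a phishing email with a malicious attachment."
    shortlist = rerank(report, retrieve(report, CORPUS), stub_llm_score)
    for technique_id, description in shortlist:
        print(technique_id, description)
```

Keeping the scorer as a plain callable means the retriever stays untouched while the re-ranking step can be swapped for an actual LLM call, which mirrors the division of labor the summary describes.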

Findings

01. Achieves state-of-the-art performance on security benchmarks.
02. Reduces reliance on large labeled datasets and extensive task-specific tuning.
03. Improves retrieval precision through zero-shot LLM re-ranking.

Abstract

Accurately identifying adversarial techniques in security texts is critical for effective cyber defense. However, existing methods face a fundamental trade-off: they either rely on generic models with limited domain precision or require resource-intensive pipelines that depend on large labeled datasets and task-specific optimizations, such as custom hard-negative mining and denoising, resources rarely available in specialized domains. We propose TechniqueRAG, a domain-specific retrieval-augmented generation (RAG) framework that bridges this gap by integrating off-the-shelf retrievers, instruction-tuned LLMs, and minimal text-technique pairs. Our approach addresses data scarcity by fine-tuning only the generation component on limited in-domain examples, circumventing the need for resource-intensive retrieval training. While conventional RAG mitigates hallucination by coupling retrieval…
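As a rough illustration of what fine-tuning only the generation component on a handful of text-technique pairs might involve, the sketch below turns one annotated sentence plus its retrieved candidates into a prompt/completion record. The prompt wording, the `prompt` and `completion` field names, and the example data are all hypothetical. The abstract does not specify the training format, so this should be read as one plausible shape and not as the authors' recipe.

```python
import json
from typing import Dict, List, Tuple


def build_example(
    text: str,
    candidates: List[Tuple[str, str]],
    gold_ids: List[str],
) -> Dict[str, str]:
    """Format one text-technique pair as a supervised fine-tuning record.

    `candidates` are (technique_id, name) pairs produced by retrieval;
    `gold_ids` are the annotated technique IDs for `text`.
    The template below is an assumption, not the paper's prompt.
    """
    candidate_lines = "\n".join(f"- {tid}: {name}" for tid, name in candidates)
    prompt = (
        "Identify the adversarial techniques described in the text.\n"
        f"Text: {text}\n"
        "Candidate techniques:\n"
        f"{candidate_lines}\n"
        "Answer with technique IDs separated by commas."
    )
    # Sort the labels so the target string is deterministic.
    return {"prompt": prompt, "completion": ", ".join(sorted(gold_ids))}


if __name__ == "__main__":
    example = build_example(
        text="The attachment launched a PowerShell script after the user opened the email.",
        candidates=[
            ("T1566", "Phishing"),
            ("T1059", "Command and Scripting Interpreter"),
            ("T1486", "Data Encrypted for Impact"),
        ],
        gold_ids=["T1566", "T1059"],
    )
    # One JSON object per line is a common layout for fine-tuning data.
    print(json.dumps(example))
```

Because the retrieved candidates appear inside the prompt, only the generator has to learn the in-domain mapping, and the retriever can remain an untrained, off-the-shelf component.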

Code & Models

Repositories

qcri/techniquerag (official)

Videos

TechniqueRAG: Retrieval Augmented Generation for Adversarial Technique Annotation in Cyber Threat Intelligence Text · underline

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Spam and Phishing Detection