Retrieval-Augmented LLMs for Security Incident Analysis
Xavier Cadet, Aditya Vikram Singh, Harsh Mamania, Edward Koh, Alex Fitts, Dirk Van Bruggen, Simona Boboila, Peter Chin, Alina Oprea

TL;DR
This paper introduces a retrieval-augmented system that combines targeted log queries with LLM reasoning to analyze security incidents. It reports high recall and precision across malware-traffic and Active Directory attack scenarios, with cheaper models trading some recall for much lower cost.
Contribution
The work presents a RAG-based framework that combines query-based log filtering with LLM reasoning for cybersecurity incident analysis, and evaluates it with eight LLM configurations.
Findings
Claude Sonnet 4 achieved 94% recall on malware scenarios.
DeepSeek V3 achieved 89% recall at one-fifteenth the cost of Claude Sonnet 4.
Llama 3.1:70b achieved 81% recall at zero per-query cost.
Abstract
Investigating cybersecurity incidents requires collecting and analyzing evidence from multiple log sources, including intrusion detection alerts, network traffic records, and authentication events. This process is labor-intensive: analysts must sift through large volumes of data to identify relevant indicators and piece together what happened. We present a RAG-based system that performs security incident analysis through targeted query-based filtering and LLM semantic reasoning. The system uses a query library with associated MITRE ATT&CK techniques to extract indicators from raw logs, then retrieves relevant context to answer forensic questions and reconstruct attack sequences. We evaluate the system with eight LLM configurations on malware traffic incidents and a multi-stage Active Directory attack. We find that LLMs have different performance and tradeoffs, with Claude Sonnet 4…
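The filter-then-retrieve flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Query` class, the two regex patterns, the sample log lines, and the keyword-overlap retrieval are all hypothetical stand-ins chosen for illustration. Only the MITRE ATT&CK technique IDs (T1110 for Brute Force, T1059.001 for PowerShell) are real identifiers.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """One entry in a hypothetical query library."""
    name: str
    technique: str  # MITRE ATT&CK technique ID associated with the query
    pattern: str    # regex applied to each raw log line


# Assumed, illustrative library; a real one would be far larger.
QUERY_LIBRARY = [
    Query("failed_logon", "T1110", r"failed (?:login|logon|password)"),
    Query("powershell_encoded", "T1059.001", r"powershell.*-enc"),
]

# Made-up log lines used only to exercise the sketch.
SAMPLE_LOGS = [
    "2024-01-01T10:00:00 sshd: Failed password for admin from 10.0.0.5",
    "2024-01-01T10:01:00 dhcp: lease renewed for 10.0.0.7",
    "2024-01-01T10:02:00 proc: powershell.exe -enc SQBFAFgA",
]


def _tokens(text):
    """Lowercased word tokens, used for naive overlap scoring."""
    return set(re.findall(r"\w+", text.lower()))


def extract_indicators(log_lines, library=QUERY_LIBRARY):
    """Keep only log lines matched by a query, tagged with its technique."""
    hits = []
    for line in log_lines:
        for query in library:
            if re.search(query.pattern, line, flags=re.IGNORECASE):
                hits.append(
                    {"technique": query.technique, "query": query.name, "line": line}
                )
    return hits


def retrieve(question, indicators, k=3):
    """Rank indicators by word overlap with the question (a stand-in for
    embedding-based retrieval) and return the top k."""
    q_tokens = _tokens(question)
    ranked = sorted(
        indicators,
        key=lambda hit: len(q_tokens & _tokens(hit["line"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context):
    """Assemble the text that would be sent to an LLM of your choice."""
    evidence = "\n".join(f"[{hit['technique']}] {hit['line']}" for hit in context)
    return f"Evidence:\n{evidence}\n\nQuestion: {question}"


if __name__ == "__main__":
    indicators = extract_indicators(SAMPLE_LOGS)
    question = "Which host ran powershell?"
    print(build_prompt(question, retrieve(question, indicators, k=1)))
```

Filtering before retrieval keeps the LLM's context small, which is one plausible reason such a design can reduce per-query cost; the sketch stops at prompt construction so that it stays independent of any particular model API.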
