GenDFIR: Advancing Cyber Incident Timeline Analysis Through Retrieval Augmented Generation and Large Language Models
Fatma Yasmine Loumachi, Mohamed Chahine Ghanem, Mohamed Amine Ferrag

TL;DR
GenDFIR introduces a novel framework that combines large language models with retrieval-augmented generation to automate and enhance cyber incident timeline analysis, demonstrating promising results in synthetic environments.
Contribution
The paper presents GenDFIR, an approach that integrates Llama 3.1 8B, used in zero-shot mode, with a RAG agent to automate forensic timeline analysis, in contrast to traditional methods that rely on specialised tools for evidence identification and feature extraction.
Findings
GenDFIR reliably reconstructs incident timelines from structured data.
The framework demonstrates robustness in synthetic testing environments.
It shows potential to automate complex forensic analysis tasks.
Abstract
Cyber timeline analysis, or forensic timeline analysis, is crucial in Digital Forensics and Incident Response (DFIR). It examines artefacts and events, particularly timestamps and metadata, to detect anomalies, establish correlations, and reconstruct incident timelines. Traditional methods rely on structured artefacts, such as logs and filesystem metadata, using specialised tools for evidence identification and feature extraction. This paper introduces GenDFIR, a framework leveraging large language models (LLMs), specifically Llama 3.1 8B in zero-shot mode, integrated with a Retrieval-Augmented Generation (RAG) agent. Incident data is preprocessed into a structured knowledge base, enabling the RAG agent to retrieve relevant events based on user prompts. The LLM interprets this context, offering semantic enrichment. Tested on synthetic data in a controlled environment, results demonstrate…
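The retrieval step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the event records, the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), and the prompt layout are all hypothetical, and a simple bag-of-words cosine similarity stands in for whatever embedding model the framework actually uses. The sketch only shows the general shape of the pipeline: score events in a small knowledge base against a user prompt, keep the top matches, order them chronologically, and assemble the context that would be passed to the LLM.

```python
import math
import re
from collections import Counter
from datetime import datetime

# Hypothetical incident events; illustrative only, not data from the paper.
EVENTS = [
    {"timestamp": "2024-05-01T09:15:00",
     "description": "Failed login attempt for user admin from 10.0.0.5"},
    {"timestamp": "2024-05-01T09:17:30",
     "description": "Successful login for user admin from 10.0.0.5"},
    {"timestamp": "2024-05-01T08:00:00",
     "description": "Scheduled backup job completed"},
    {"timestamp": "2024-05-01T09:20:10",
     "description": "User admin created new account svc_temp"},
]


def tokenize(text):
    """Lowercase the text and split it into simple word-like tokens."""
    return re.findall(r"[a-z0-9_.]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, events, k=3):
    """Return up to k events matching the query, in chronological order."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(e["description"]))), e)
              for e in events]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    top = [event for score, event in ranked if score > 0]
    return sorted(top, key=lambda e: datetime.fromisoformat(e["timestamp"]))


def build_prompt(query, retrieved):
    """Assemble the retrieved events into a context block for an LLM."""
    context = "\n".join(
        f"[{e['timestamp']}] {e['description']}" for e in retrieved
    )
    return f"Context events (chronological):\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    question = "login admin"
    print(build_prompt(question, retrieve(question, EVENTS)))
```

In this sketch the chronological sort happens after retrieval, so the model would see the matching events as a timeline rather than in relevance order. The call to the LLM itself is deliberately left out, since the abstract does not specify how the model is served.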
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Data Quality and Management
Methods: Attention Is All You Need · Byte Pair Encoding · Attention Dropout · Linear Layer · Softmax · Dense Connections · Linear Warmup With Linear Decay · Dropout · WordPiece
