Hallucination-Resistant Security Planning with a Large Language Model
Kim Hammar, Tansu Alpcan, and Emil Lupu

TL;DR
This paper presents a framework that uses large language models for security management, reducing hallucinations and improving incident response planning through iterative decision support and external feedback integration.
Contribution
It introduces a novel iterative framework with consistency checks and feedback for LLMs in security tasks, controlling hallucination risk and providing theoretical regret bounds.
Findings
Reduces recovery times by up to 30% in experiments
Controls hallucination risk via consistency threshold tuning
Establishes a regret bound for in-context learning under certain conditions
Abstract
Large language models (LLMs) are promising tools for supporting security management tasks, such as incident response planning. However, their unreliability and tendency to hallucinate remain significant challenges. In this paper, we address these challenges by introducing a principled framework for using an LLM as decision support in security management. Our framework integrates the LLM in an iterative loop where it generates candidate actions that are checked for consistency with system constraints and lookahead predictions. When consistency is low, we abstain from the generated actions and instead collect external feedback, e.g., by evaluating actions in a digital twin. This feedback is then used to refine the candidate actions through in-context learning (ICL). We prove that this design allows us to control the hallucination risk by tuning the consistency threshold. Moreover, we…
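The loop described in the abstract (generate candidates, check consistency, abstain and collect feedback when consistency is low, then refine through in-context learning) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name plan_with_abstention, the callables generate, consistency, and collect_feedback, the per-action scoring in [0, 1], and the max_rounds cutoff are all hypothetical choices made for this example.

```python
from typing import Callable, List, Optional


def plan_with_abstention(
    generate: Callable[[List[str], int], List[str]],
    consistency: Callable[[str], float],
    collect_feedback: Callable[[str], str],
    threshold: float,
    n_candidates: int = 3,
    max_rounds: int = 5,
) -> Optional[str]:
    """Return an action whose consistency meets the threshold, else None.

    All callables are hypothetical stand-ins:
      generate(context, n)     -> n candidate actions from the LLM, given
                                  the feedback collected so far
      consistency(action)      -> assumed score in [0, 1]
      collect_feedback(action) -> external feedback, e.g. from a digital twin
    """
    context: List[str] = []  # feedback used as in-context examples
    for _ in range(max_rounds):
        candidates = generate(context, n_candidates)
        if not candidates:
            return None
        # Pick the candidate with the highest consistency score.
        best = max(candidates, key=consistency)
        if consistency(best) >= threshold:
            return best  # consistent enough: accept the action
        # Consistency too low: abstain, gather feedback, and retry.
        context.append(collect_feedback(best))
    return None  # abstained in every round


if __name__ == "__main__":
    # Toy stubs standing in for the LLM, the checker, and the digital twin.
    def toy_generate(context: List[str], n: int) -> List[str]:
        return ["isolate host"] * n if context else ["reboot server"] * n

    def toy_consistency(action: str) -> float:
        return 0.9 if action == "isolate host" else 0.2

    def toy_feedback(action: str) -> str:
        return f"{action}: did not contain the incident in the twin"

    print(plan_with_abstention(toy_generate, toy_consistency,
                               toy_feedback, threshold=0.5))
```

In this sketch, raising threshold makes the loop abstain more often and accept fewer low-consistency actions, which loosely mirrors the abstract's point that the hallucination risk is controlled by tuning the consistency threshold.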
Taxonomy
Topics: Information and Cyber Security · Adversarial Robustness in Machine Learning · Advanced Graph Neural Networks
