Cyber Knowledge Completion Using Large Language Models
Braden K Webb, Sumit Purohit, Rounak Meyur

TL;DR
This paper explores using Large Language Models (LLMs) and retrieval-augmented generation (RAG) to enhance cyber-attack knowledge graphs, addressing incomplete cybersecurity information in IoT-enabled cyber-physical systems.
Contribution
It introduces a novel RAG-based framework utilizing LLMs for cyber-attack knowledge completion and compares it with traditional classification methods.
Findings
The RAG-based approach outperforms baseline models in mapping attack patterns.
Embedding models effectively encode attack and adversarial information.
The framework improves the completeness of cyber-attack knowledge graphs.
Abstract
The integration of the Internet of Things (IoT) into Cyber-Physical Systems (CPSs) has expanded their cyber-attack surface, introducing new and sophisticated threats with potential to exploit emerging vulnerabilities. Assessing the risks of CPSs is increasingly difficult due to incomplete and outdated cybersecurity knowledge. This highlights the urgent need for better-informed risk assessments and mitigation strategies. While previous efforts have relied on rule-based natural language processing (NLP) tools to map vulnerabilities, weaknesses, and attack patterns, recent advancements in Large Language Models (LLMs) present a unique opportunity to enhance cyber-attack knowledge completion through improved reasoning, inference, and summarization capabilities. We apply embedding models to encapsulate information on attack patterns and adversarial techniques, generating mappings between them…
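The embedding-and-mapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the record IDs (`TECH-A`, etc.) and descriptions are hypothetical, and the function names are invented for illustration. It shows only the retrieval half of a RAG pipeline, ranking candidate adversarial techniques for one attack-pattern description by cosine similarity; in a full pipeline the top candidates would presumably be passed to an LLM as context.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_techniques(pattern_text: str, techniques: dict, top_k: int = 2) -> list:
    """Return the top_k (technique_id, score) pairs most similar to the pattern."""
    query = embed(pattern_text)
    scored = [(tid, cosine(query, embed(desc))) for tid, desc in techniques.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical records, invented for illustration only.
TECHNIQUES = {
    "TECH-A": "Adversary sends phishing email with malicious attachment to user",
    "TECH-B": "Adversary guesses passwords by brute force against login service",
    "TECH-C": "Adversary exploits firmware update mechanism on embedded device",
}
PATTERN = "Attacker tries many passwords against a login service until one works"

candidates = rank_techniques(PATTERN, TECHNIQUES)
print(candidates)
```

Swapping `embed` for a learned sentence-embedding model would leave the ranking logic unchanged, since only the vector representation differs.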
Taxonomy
Topics: Topic Modeling
