Large Language Model-Based Reward Design for Deep Reinforcement Learning-Driven Autonomous Cyber Defense
Sayak Mukherjee, Samrat Chatterjee, Emilie Purvine, Ted Fujimoto, Tegan Emerson

TL;DR
This paper introduces a novel approach using large language models to design rewards for deep reinforcement learning agents in autonomous cyber defense, enabling the development of effective defense strategies in complex, dynamic environments.
Contribution
It presents a new LLM-based reward design method for cyber defense, incorporating heterogeneous agent personas and contextual information to improve DRL policy learning.
Findings
LLM-guided rewards improve defense policy effectiveness
Heterogeneous agent personas enhance simulation realism
Effective strategies against diverse cyber adversaries
Abstract
Designing rewards for autonomous cyber attack and defense learning agents in a complex, dynamic environment is a challenging task for subject matter experts. We propose a large language model (LLM)-based reward design approach to generate autonomous cyber defense policies in a deep reinforcement learning (DRL)-driven experimental simulation environment. Multiple attack and defense agent personas were crafted, reflecting heterogeneity in agent actions, to generate LLM-guided reward designs where the LLM was first provided with contextual cyber simulation environment information. These reward structures were then utilized within a DRL-driven attack-defense simulation environment to learn an ensemble of cyber defense policies. Our results suggest that LLM-guided reward designs can lead to effective defense strategies against diverse adversarial behaviors.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Information and Cyber Security
