Large Language Model-Based Reward Design for Deep Reinforcement Learning-Driven Autonomous Cyber Defense

Sayak Mukherjee; Samrat Chatterjee; Emilie Purvine; Ted Fujimoto; Tegan Emerson

arXiv:2511.16483 · cs.LG · November 21, 2025

TL;DR

This paper uses a large language model (LLM) to design rewards for deep reinforcement learning (DRL) agents in a simulated cyber attack-defense environment. The resulting defense policies are reported to be effective against diverse adversarial behaviors.

Contribution

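The paper presents an LLM-based reward design method for autonomous cyber defense. The LLM is first given contextual information about the simulation environment, then generates reward designs for multiple attack and defense agent personas; these rewards are used to train an ensemble of DRL defense policies.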

Findings

01 LLM-guided rewards improve defense policy effectiveness

02 Heterogeneous agent personas enhance simulation realism

03 Effective strategies against diverse cyber adversaries

Abstract

Designing rewards for autonomous cyber attack and defense learning agents in a complex, dynamic environment is a challenging task for subject matter experts. We propose a large language model (LLM)-based reward design approach to generate autonomous cyber defense policies in a deep reinforcement learning (DRL)-driven experimental simulation environment. Multiple attack and defense agent personas were crafted, reflecting heterogeneity in agent actions, to generate LLM-guided reward designs where the LLM was first provided with contextual cyber simulation environment information. These reward structures were then utilized within a DRL-driven attack-defense simulation environment to learn an ensemble of cyber defense policies. Our results suggest that LLM-guided reward designs can lead to effective defense strategies against diverse adversarial behaviors.
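The abstract does not give the prompts, persona definitions, or reward values, so the following is only a minimal sketch of the general idea: combine environment context with a persona description into a prompt, then parse the reply into a per-event reward table that a DRL training loop could query. The persona name, event names, prompt wording, reply format, and numeric values are all hypothetical, and the LLM call is replaced by a hard-coded stand-in string.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Persona:
    """Hypothetical agent persona; name and description are illustrative only."""
    name: str
    description: str


def build_reward_prompt(env_context: str, persona: Persona, events: List[str]) -> str:
    """Combine environment context and a persona into one reward-design prompt."""
    event_list = "\n".join(f"- {e}" for e in events)
    return (
        f"Environment context:\n{env_context}\n\n"
        f"Agent persona: {persona.name}. {persona.description}\n\n"
        "Assign a numeric reward to each event, one per line as 'event: value':\n"
        f"{event_list}"
    )


def parse_reward_table(reply: str, events: List[str]) -> Dict[str, float]:
    """Parse 'event: value' lines; events missing from the reply default to 0.0."""
    table = {e: 0.0 for e in events}
    for line in reply.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip().lstrip("-").strip()
        if sep and name in table:
            try:
                table[name] = float(value)
            except ValueError:
                pass  # skip lines whose value is not a number
    return table


def step_reward(table: Dict[str, float], observed: List[str]) -> float:
    """Sum the rewards of the events observed in one simulation step."""
    return sum(table.get(e, 0.0) for e in observed)


# Hypothetical event names and persona, not taken from the paper.
EVENTS = ["host_compromised", "host_restored", "decoy_triggered"]
persona = Persona("cautious_defender", "Prefers low-disruption actions.")
prompt = build_reward_prompt("Small enterprise network simulation.", persona, EVENTS)

# Stand-in for an LLM response; the values are made up for illustration.
fake_reply = "host_compromised: -1.0\nhost_restored: 0.5\ndecoy_triggered: 0.2"
table = parse_reward_table(fake_reply, EVENTS)
print(step_reward(table, ["host_compromised", "host_restored"]))
```

Under these assumptions, each persona would get its own reward table, and training one policy per table is one plausible way to arrive at an ensemble of defense policies like the one the abstract mentions.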

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Information and Cyber Security