A Comparative Evaluation of Teacher-Guided Reinforcement Learning Techniques for Autonomous Cyber Operations

Konur Tholl; Mariam El Mezouar; Ranwa Al Mallah

arXiv:2508.14340 · cs.LG · August 21, 2025

TL;DR

This paper evaluates four teacher-guided reinforcement learning techniques for autonomous cyber operations in the simulated CybORG environment, showing that teacher integration can improve training efficiency, measured by early policy performance and convergence speed.

Contribution

The paper implements and compares four teacher-guided RL techniques for autonomous cyber operations. According to the authors, such techniques had shown promise in other domains but had not previously been applied to this one.

Findings

01 Teacher-guided techniques improve early policy performance.

02 Teacher integration accelerates convergence during training.

03 The training-efficiency gains are significant and were demonstrated in the simulated CybORG environment.

Abstract

Autonomous Cyber Operations (ACO) rely on Reinforcement Learning (RL) to train agents to make effective decisions in the cybersecurity domain. However, existing ACO applications require agents to learn from scratch, leading to slow convergence and poor early-stage performance. While teacher-guided techniques have demonstrated promise in other domains, they have not yet been applied to ACO. In this study, we implement four distinct teacher-guided techniques in the simulated CybORG environment and conduct a comparative evaluation. Our results demonstrate that teacher integration can significantly improve training efficiency in terms of early policy performance and convergence speed, highlighting its potential benefits for autonomous cybersecurity.
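
To make the idea of teacher guidance concrete, the following is a minimal sketch of one generic form of it, action advising, in which a teacher occasionally overrides the learner's action choice early in training. This summary does not name the paper's four techniques, so the sketch should not be read as any of them, and it does not use CybORG. The chain environment, the rule-based `teacher_policy`, and all parameters (`advise_prob`, `advise_decay`, and the Q-learning hyperparameters) are hypothetical choices made for illustration only.

```python
import random

N_STATES = 6      # hypothetical toy chain: states 0..5, state 5 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy chain dynamics: reward 1.0 on reaching the goal, otherwise 0.0."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def teacher_policy(state):
    """Hypothetical rule-based teacher: always move toward the goal."""
    return 1


def train(episodes=50, advise_prob=0.0, advise_decay=0.9, seed=0,
          alpha=0.5, gamma=0.95, epsilon=0.1, max_steps=50):
    """Tabular Q-learning where a teacher picks the action with probability
    advise_prob; that probability is multiplied by advise_decay per episode."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    steps_per_episode = []
    for _ in range(episodes):
        state, steps = 0, 0
        for _ in range(max_steps):
            if rng.random() < advise_prob:
                action = teacher_policy(state)      # follow the teacher's advice
            elif rng.random() < epsilon:
                action = rng.choice(ACTIONS)        # explore at random
            else:
                best = max(q[state])                # greedy, ties broken at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            nxt, reward, done = step(state, action)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state, steps = nxt, steps + 1
            if done:
                break
        steps_per_episode.append(steps)
        advise_prob *= advise_decay                 # hand control back to the learner
    return q, steps_per_episode
```

Calling `train(advise_prob=0.0)` gives an unguided baseline, while `train(advise_prob=1.0)` starts with the teacher fully in control and gradually hands control to the learner. Comparing the two `steps_per_episode` lists is one simple way to inspect early performance and convergence on this toy task; it says nothing about the magnitude of the effects reported in the paper.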

Taxonomy

Topics: Reinforcement Learning in Robotics · Modular Robots and Swarm Intelligence