Stepping Out of the Shadows: Reinforcement Learning in Shadow Mode

Philipp Gassert; Matthias Althoff

arXiv:2410.23419 · cs.LG · November 1, 2024

TL;DR

This paper introduces a shadow mode reinforcement learning approach that trains agents alongside conventional controllers to improve performance in cyber-physical systems while minimizing risks and training time.

Contribution

The novel shadow mode training method lets RL agents learn alongside an existing conventional controller, reducing risk and training time on physical systems that lack a usable simulation model.

Findings

01. Effective training on reach-avoid task where standard methods fail

02. Low regret during training by combining RL and conventional control

03. Improved system performance over traditional controllers

Abstract

Reinforcement learning (RL) is not yet competitive for many cyber-physical systems, such as robotics, process automation, and power systems, as training on a system with physical components cannot be accelerated, and simulation models do not exist or suffer from a large simulation-to-reality gap. During the long training time, expensive equipment cannot be used and might even be damaged due to inappropriate actions of the reinforcement learning agent. Our novel approach addresses exactly this problem: We train the reinforcement agent in a so-called shadow mode with the assistance of an existing conventional controller, which does not have to be trained and instantaneously performs reasonably well. In shadow mode, the agent relies on the controller to provide action samples and guidance towards favourable states to learn the task, while simultaneously estimating for which states the…

Taxonomy

Topics: Complex Systems and Decision Making · Cognitive Science and Mapping