Deep Reinforcement Learning Behavioral Mode Switching Using Optimal Control Based on a Latent Space Objective

Sindre Benjamin Remman, Bjørn Andreas Kristiansen, Anastasios M. Lekkas

arXiv:2406.01178·cs.LG·June 5, 2024

TL;DR

This paper introduces a method to modify deep reinforcement learning policies by controlling their latent space with optimal control, enabling switching between behavioral modes to improve task success.

Contribution

It presents a novel approach to identify and manipulate behavioral modes in RL policies through latent space optimization using optimal control techniques.

Findings

01 Successfully switches between behavioral modes in the lunar lander environment

02 Imposes desired behaviors on the policy, converting failed episodes into successful ones

03 Provides a new interpretability filter for neural network policies

Abstract

In this work, we use optimal control to change the behavior of a deep reinforcement learning policy by optimizing directly in the policy's latent space. We hypothesize that distinct behavioral patterns, termed behavioral modes, can be identified within certain regions of a deep reinforcement learning policy's latent space, meaning that specific actions or strategies are preferred within these regions. We identify these behavioral modes using latent space dimension-reduction with Pairwise Controlled Manifold Approximation (PaCMAP). Using the actions generated by the optimal control procedure, we move the system from one behavioral mode to another. We subsequently utilize these actions as a filter for interpreting the neural network policy. The results show that this approach can impose desired behavioral modes in the policy, demonstrated by showing how a failed episode can be made successful and vice versa using the lunar…
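
The idea of optimizing actions against a latent space objective can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the linear dynamics `A`, `B`, the random one-layer "policy encoder" `W`, the target latent `z_target`, and the use of random shooting as the optimizer (the abstract does not say which optimal control solver the authors use). The sketch searches for an action sequence whose resulting states have latent representations close to a target region.

```python
import numpy as np

def latent(W, s):
    # Hypothetical stand-in for a hidden layer of the policy network.
    return np.tanh(W @ s)

def rollout_cost(actions, s0, A, B, W, z_target):
    # Sum of squared latent-space distances to the target along the rollout.
    s = s0.copy()
    cost = 0.0
    for a in actions:
        s = A @ s + B @ a  # assumed linear toy dynamics
        cost += float(np.sum((latent(W, s) - z_target) ** 2))
    return cost

def random_shooting(s0, A, B, W, z_target, horizon=10, n_samples=500, seed=0):
    # Sample candidate action sequences and keep the one with the lowest cost.
    rng = np.random.default_rng(seed)
    n_act = B.shape[1]
    candidates = rng.uniform(-1.0, 1.0, size=(n_samples, horizon, n_act))
    candidates[0] = 0.0  # keep the do-nothing sequence as a baseline
    costs = np.array(
        [rollout_cost(c, s0, A, B, W, z_target) for c in candidates]
    )
    best = int(np.argmin(costs))
    return candidates[best], costs[best], costs[0]

# Toy double-integrator-like system (illustrative values only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
W = np.random.default_rng(1).normal(size=(3, 2))
s0 = np.array([1.0, 0.0])

# Target latent: the representation of the origin, standing in for a
# desired behavioral mode region.
z_target = latent(W, np.zeros(2))

best_actions, best_cost, baseline_cost = random_shooting(s0, A, B, W, z_target)
```

Because the do-nothing sequence is always among the candidates, `best_cost` can never exceed `baseline_cost`. In a realistic setting the encoder would be the trained policy's hidden layers, the dynamics would come from the environment or a model of it, and a gradient-based or other dedicated solver would likely replace random shooting.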

Taxonomy

Topics: Greenhouse Technology and Climate Control · Energy, Environment, Agriculture Analysis · Reinforcement Learning in Robotics