A reinforcement learning approach to hybrid control design
Meet Gandhi, Atreyee Kundu, Shalabh Bhatnagar

TL;DR
This paper casts hybrid control design for systems with unknown models as a single MDP and adapts PPO to hybrid action spaces. On a set of benchmark problems, the adapted algorithm is observed to converge and find the optimal policy.
Contribution
The paper presents a novel MDP-based framework for hybrid control design and adapts PPO to hybrid action spaces, enabling model-free optimal control policy learning.
Findings
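On each benchmark problem considered, the adapted PPO algorithm is observed to converge and find the optimal policy.
Modelling hybrid control design as a single MDP allows off-the-shelf RL algorithms to be applied to it.
A set of benchmark hybrid control design problems is modelled in the proposed MDP framework.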
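To make the single-MDP idea concrete, here is a minimal sketch of a hybrid system wrapped as one MDP whose action is a pair (discrete mode, continuous input). Everything in it is a hypothetical illustration and not taken from the paper: the class name `ToyHybridMDP`, the scalar dynamics, the per-mode rates, the quadratic reward, and the horizon are all assumptions chosen for brevity.

```python
from dataclasses import dataclass
from typing import Tuple

# A hybrid action: (index of the discrete mode, continuous control input).
HybridAction = Tuple[int, float]


@dataclass
class ToyHybridMDP:
    """Hypothetical switched scalar system exposed as a single MDP."""
    rates: Tuple[float, ...] = (0.9, 1.1)  # assumed per-mode multipliers
    horizon: int = 50                      # assumed episode length
    x: float = 1.0                         # continuous state
    t: int = 0                             # step counter

    def reset(self) -> float:
        """Return to the initial state and restart the step counter."""
        self.x, self.t = 1.0, 0
        return self.x

    def step(self, action: HybridAction):
        """Apply one hybrid action; return (next_state, reward, done)."""
        mode, u = action
        u = max(-1.0, min(1.0, u))  # keep the continuous input in [-1, 1]
        # The discrete mode selects the dynamics; u enters additively.
        self.x = self.rates[mode] * self.x + 0.1 * u
        self.t += 1
        # Assumed cost: drive the state to zero with small control effort.
        reward = -(self.x ** 2) - 0.01 * u ** 2
        done = self.t >= self.horizon
        return self.x, reward, done
```

Because the mode choice and the continuous input arrive together in one `step` call, an RL algorithm sees an ordinary MDP interface and needs no knowledge of the underlying dynamics.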
Abstract
In this paper we design hybrid control policies for hybrid systems whose mathematical models are unknown. Our contributions are threefold. First, we propose a framework for modelling the hybrid control design problem as a single Markov Decision Process (MDP). This result facilitates the application of off-the-shelf algorithms from Reinforcement Learning (RL) literature towards designing optimal control policies. Second, we model a set of benchmark examples of hybrid control design problems in the proposed MDP framework. Third, we adapt the recently proposed Proximal Policy Optimisation (PPO) algorithm for the hybrid action space and apply it to the above set of problems. It is observed that in each case the algorithm converges and finds the optimal policy.
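The abstract does not say how PPO is adapted to the hybrid action space. One plausible way, sketched below purely as an assumption, is to factorise the policy into a categorical distribution over modes and a per-mode Gaussian over the continuous input, so that the joint log-probability is the sum of the two parts and can be plugged into the standard PPO clipped surrogate. All function names here are hypothetical.

```python
import math


def categorical_log_prob(logits, k):
    """Log-probability of index k under softmax(logits), computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[k] - log_z


def gaussian_log_prob(mean, std, u):
    """Log-density of u under a scalar Normal(mean, std^2)."""
    return (-0.5 * ((u - mean) / std) ** 2
            - math.log(std)
            - 0.5 * math.log(2 * math.pi))


def hybrid_log_prob(logits, means, stds, action):
    """Assumed factorisation: log pi(mode) + log pi(u | mode)."""
    mode, u = action
    return (categorical_log_prob(logits, mode)
            + gaussian_log_prob(means[mode], stds[mode], u))


def ppo_clip_term(new_logp, old_logp, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample (to be maximised)."""
    ratio = math.exp(new_logp - old_logp)
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)
```

With this factorisation the probability ratio in `ppo_clip_term` is taken over the whole hybrid action, so the rest of a PPO training loop would stay unchanged. Other designs, such as separate clipping for the discrete and continuous parts, are equally conceivable.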
Taxonomy
Topics: Reinforcement Learning in Robotics · Electric Vehicles and Infrastructure · Adaptive Dynamic Programming Control
