Actively Learning Reinforcement Learning: A Stochastic Optimal Control Approach

Mohammad S. Ramadan; Mahmoud A. Hayajnh; Michael T. Tolley; Kyriakos G. Vamvoudakis

arXiv:2309.10831·cs.LG·September 10, 2024

Open Access · 1 Repo

TL;DR

This paper introduces a reinforcement learning framework that combines active exploration with stochastic optimal control to improve decision-making under uncertainty, while addressing computational challenges and enabling real-time cautious exploration.

Contribution

It presents a novel RL approach that integrates stochastic optimal control to facilitate active learning, automatic exploration, and computational efficiency in uncertain environments.

Findings

01. The proposed method stabilizes control performance under uncertainty.

02. It outperforms certainty equivalence-based LQ regulators in simulations.

03. The approach enables real-time cautious exploration and exploitation.

Abstract

In this paper, we propose a framework for achieving two intertwined objectives: (i) equipping reinforcement learning with active exploration and deliberate information gathering, so that it regulates the state and parameter uncertainties resulting from modeling mismatches and noisy sensing; and (ii) overcoming the computational intractability of stochastic optimal control. We approach both objectives by using reinforcement learning to compute the stochastic optimal control law. On one hand, we avoid the curse of dimensionality that prohibits the direct solution of the stochastic dynamic programming equation. On the other hand, the resulting stochastic optimal control reinforcement learning agent admits caution and probing, that is, optimal online exploration and exploitation. Unlike fixed exploration and exploitation balance, caution and probing are employed automatically by the…

Code & Models

Repositories

msramada/active-learning-reinforcement-learning
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Supply Chain and Inventory Management