A generalized stacked reinforcement learning method for sampled systems

Pavel Osinenko; Dmitrii Dobriborsci; Grigory Yaremenko; Georgiy Malaniya

arXiv:2108.10392·cs.RO·November 29, 2022

Open Access

TL;DR

This paper proposes and benchmarks two reinforcement learning methods for sampled systems. Both hybridize model-predictive control (MPC) with a critic: one learns the optimal Q-function, the other the optimal value (cost-to-go) function.

Contribution

The paper proposes a hybrid RL approach integrating MPC with critic learning for sampled systems, addressing the gap between continuous-time physical systems and digital RL methods.
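One common way to combine MPC with a critic is to use the learned value function as a terminal cost on a finite prediction horizon. The expression below is a generic sketch of that idea and not necessarily the paper's exact formulation. All symbols are assumptions introduced for illustration: \rho is a stage cost, \hat{V} is the critic's value estimate, N is the horizon length, and \Phi_\delta is the state transition over one sampling period \delta with the action held constant.

```latex
\min_{u_0, \dots, u_{N-1}} \; \sum_{k=0}^{N-1} \rho(x_k, u_k) + \hat{V}(x_N)
\quad \text{subject to} \quad x_{k+1} = \Phi_\delta(x_k, u_k), \quad k = 0, \dots, N-1
```

Only the first action u_0 is applied, and the optimization is repeated at the next sampling instant.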

Findings

01 Hybrid RL methods outperform traditional approaches in sampled system environments.

02 The proposed methods demonstrate improved control performance in a mobile robot case study.

03 Optimality analysis confirms the effectiveness of the hybrid approach.

Abstract

A common setting of reinforcement learning (RL) is a Markov decision process (MDP) in which the environment is a stochastic discrete-time dynamical system. Whereas MDPs are suitable in such applications as video games or puzzles, physical systems are time-continuous. A general variant of RL is of digital format, where updates of the value (or cost) and policy are performed at discrete moments in time. The agent-environment loop then amounts to a sampled system, of which sample-and-hold is a specific case. In this paper, we propose and benchmark two RL methods suitable for sampled systems. Specifically, we hybridize model-predictive control (MPC) with critics learning the optimal Q-function and value (or cost-to-go) function. Optimality is analyzed, and a performance comparison is carried out in an experimental case study with a mobile robot.
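The sample-and-hold setting mentioned in the abstract can be illustrated with a minimal sketch: a continuous-time system is integrated on a fine time grid, while the agent picks a new action only at sampling instants and holds it constant in between. Everything here is an assumption for illustration and is not taken from the paper: the function name `simulate_sample_and_hold`, the scalar state, the explicit Euler integrator, and the toy dynamics and policy in the usage lines.

```python
def simulate_sample_and_hold(f, policy, x0, sampling_time, t_final, n_substeps=20):
    """Simulate x' = f(x, u) with a digital controller (hypothetical helper).

    The action u is recomputed every `sampling_time` seconds and held
    constant between sampling instants; the dynamics are integrated with
    explicit Euler using `n_substeps` steps per sampling period.
    """
    dt = sampling_time / n_substeps
    n_samples = round(t_final / sampling_time)
    x = x0
    states, actions = [x0], []
    for _ in range(n_samples):
        u = policy(x)  # agent acts only at the sampling instant
        actions.append(u)
        for _ in range(n_substeps):
            x = x + dt * f(x, u)  # u stays fixed during the hold interval
        states.append(x)
    return states, actions


# Toy example (assumed): stable linear system x' = -x + u with u = -2x.
states, actions = simulate_sample_and_hold(
    f=lambda x, u: -x + u,
    policy=lambda x: -2.0 * x,
    x0=1.0,
    sampling_time=0.1,
    t_final=2.0,
)
```

From the agent's point of view, the recorded `states` form a discrete-time trajectory, even though the underlying dynamics evolve continuously between samples.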

Taxonomy

Topics: Reinforcement Learning in Robotics