An experimental study of two predictive reinforcement learning methods and comparison with model-predictive control
Dmitrii Dobriborsci, Pavel Osinenko

TL;DR
This study experimentally compares two predictive reinforcement learning methods with model-predictive control for mobile robot motion, demonstrating that RL methods, especially stacked Q-learning, outperform MPC in accumulated cost.
Contribution
The paper presents an experimental evaluation of predictive RL controllers on a mobile robot and shows their advantage over a conventional MPC baseline, highlighting the potential of stacked Q-learning.
Findings
Both RL methods achieve a lower accumulated cost than the MPC baseline.
Stacked Q-learning performs best among tested methods.
RL methods retain MPC's desirable properties while improving performance.
Abstract
Reinforcement learning (RL) has been successfully used in various simulations and computer games. Industry-related applications, such as autonomous mobile robot motion control, nevertheless remain challenging for RL. This paper presents an experimental evaluation of predictive RL controllers for optimal mobile robot motion control. Model-predictive control (MPC) serves as the baseline for comparison. Two RL methods are tested: roll-out Q-learning, which may be regarded as MPC with the terminal cost being a Q-function approximation, and so-called stacked Q-learning, which is in turn like MPC with the running cost replaced by a Q-function approximation. The experimental platform is a mobile robot with a differential drive (Robotis Turtlebot3). Experimental results showed that both RL methods beat the baseline in terms of the accumulated cost, whereas the stacked variant…
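The structural difference between the three controllers described in the abstract can be sketched in code. The following is a minimal illustration, not the authors' implementation: the unicycle kinematic model, the quadratic `running_cost`, the quadratic-feature `q_approx`, and all function names are assumptions chosen for the example, and the paper's actual formulation (critic structure, discounting, horizon) may differ.

```python
import numpy as np

def step(state, action, dt=0.1):
    """Euler step of a unicycle model (assumed stand-in for a differential drive)."""
    x, y, theta = state
    v, omega = action
    return np.array([x + dt * v * np.cos(theta),
                     y + dt * v * np.sin(theta),
                     theta + dt * omega])

def running_cost(state, action):
    """Hypothetical quadratic running cost."""
    return float(state @ state + 0.1 * (action @ action))

def q_approx(state, action, weights):
    """Hypothetical Q-function approximation: linear in squared features."""
    z = np.concatenate([state, action])
    return float(weights @ z**2)

def rollout(state, actions, dt=0.1):
    """Predicted states; states[k] is the state in which actions[k] is applied."""
    states = [state]
    for a in actions[:-1]:
        states.append(step(states[-1], a, dt))
    return states

def mpc_objective(state, actions):
    """MPC: sum of running costs over the whole horizon."""
    return sum(running_cost(s, a) for s, a in zip(rollout(state, actions), actions))

def rollout_q_objective(state, actions, weights):
    """Roll-out Q-learning: running costs, with a Q approximation as terminal cost."""
    states = rollout(state, actions)
    head = sum(running_cost(s, a) for s, a in zip(states[:-1], actions[:-1]))
    return head + q_approx(states[-1], actions[-1], weights)

def stacked_q_objective(state, actions, weights):
    """Stacked Q-learning: every running cost is replaced by a Q approximation."""
    return sum(q_approx(s, a, weights) for s, a in zip(rollout(state, actions), actions))
```

In each case the objective would be minimized over the action sequence and only the first action applied, in the usual receding-horizon fashion. With these particular assumed forms, choosing `weights = [1, 1, 1, 0.1, 0.1]` makes `q_approx` coincide with `running_cost`, so all three objectives agree; they differ once the Q approximation is learned from data.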
Taxonomy
Topics: Advanced Control Systems Optimization · Iterative Learning Control Systems · Viral Infectious Diseases and Gene Expression in Insects
