Continuous Deep Q-Learning with Simulator for Stabilization of Uncertain Discrete-Time Systems

Junya Ikemoto; Toshimitsu Ushio

arXiv:2101.05640·cs.LG·April 20, 2021

TL;DR

This paper introduces a two-stage reinforcement learning method using simulators with multiple system models to stabilize uncertain discrete-time systems, improving learning efficiency and robustness.

Contribution

The proposed approach combines multiple virtual system models with continuous deep Q-learning to enhance policy stability under model uncertainties.
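Continuous deep Q-learning is commonly formulated with a normalized advantage function (NAF), in which Q(x, u) = V(x) - (1/2)(u - mu(x))^T P(x)(u - mu(x)) with P(x) positive definite, so the greedy action is simply mu(x). The summary above does not spell out the parameterization, so the following is only a minimal sketch assuming that standard form, reduced to a scalar action. The function name `naf_q_value` and the numeric values standing in for network outputs are hypothetical.

```python
def naf_q_value(v, mu, p, u):
    """Scalar NAF form: Q(x, u) = V(x) - 0.5 * P(x) * (u - mu(x))**2.

    v, mu, p stand in for the outputs of neural networks evaluated at a
    state x (assumption: scalar action, so P(x) is a positive number).
    """
    if p <= 0:
        raise ValueError("P(x) must be positive so that Q is concave in u")
    return v - 0.5 * p * (u - mu) ** 2


# Hypothetical network outputs at some state x (illustrative values only).
v, mu, p = 1.5, -0.3, 2.0

q_greedy = naf_q_value(v, mu, p, mu)   # action equal to mu(x)
q_other = naf_q_value(v, mu, p, 0.7)   # any other action scores lower
print(q_greedy, q_other)
```

Because Q is a concave quadratic in the action, the maximizing action is available in closed form, which is what lets Q-learning be applied to continuous action spaces without a separate search over actions.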

Findings

01. Effective stabilization demonstrated in numerical simulations.
02. Improved learning efficiency over traditional RL methods.
03. Robustness against identification errors in system parameters.

Abstract

Applications of reinforcement learning (RL) to stabilization problems of real systems are restricted since an agent needs many experiences to learn an optimal policy and may determine dangerous actions during its exploration. If we know a mathematical model of a real system, a simulator is useful because it predicts behaviors of the real system using the mathematical model with a given system parameter vector. We can collect many experiences more efficiently than interactions with the real system. However, it is difficult to identify the system parameter vector accurately. If we have an identification error, experiences obtained by the simulator may degrade the performance of the learned policy. Thus, we propose a practical RL algorithm that consists of two stages. At the first stage, we choose multiple system parameter vectors. Then, we have a mathematical model for each system…
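The first stage described in the abstract, gathering simulated experiences under several candidate system parameter vectors, can be sketched roughly as follows. Everything concrete here is an illustrative assumption rather than the paper's setup: the scalar linear model x' = a*x + b*u, the three candidate parameter vectors, the uniform random exploration policy, and the quadratic reward. Each transition is tagged with the index of the virtual model that produced it, so no particular way of combining the models is implied.

```python
import random


def simulate_step(x, u, theta):
    """One step of a hypothetical scalar model x' = a*x + b*u."""
    a, b = theta
    return a * x + b * u


def collect_experiences(thetas, episodes_per_model=5, horizon=20, seed=0):
    """Roll out a random exploration policy in one virtual model per theta.

    Returns a list of (model_index, x, u, reward, x_next) tuples.
    """
    rng = random.Random(seed)
    buffer = []
    for i, theta in enumerate(thetas):
        for _ in range(episodes_per_model):
            x = rng.uniform(-1.0, 1.0)          # random initial state
            for _ in range(horizon):
                u = rng.uniform(-1.0, 1.0)      # random exploratory action
                x_next = simulate_step(x, u, theta)
                # Assumed quadratic cost on state and input, negated as reward.
                reward = -(x_next ** 2 + 0.1 * u ** 2)
                buffer.append((i, x, u, reward, x_next))
                x = x_next
    return buffer


# Hypothetical candidates spread around a nominal estimate (a, b) = (1.0, 1.0).
candidate_thetas = [(0.9, 1.0), (1.0, 1.0), (1.1, 1.0)]
experiences = collect_experiences(candidate_thetas)
print(len(experiences))  # 3 models * 5 episodes * 20 steps = 300
```

Spreading the candidates around the nominal estimate is one simple way to cover plausible identification errors; all exploration happens inside the simulator, so no exploratory action is applied to the real system at this stage.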

Code & Models

Repositories

pondbooks/CDQL_with_Sim (PyTorch)

Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Advanced Control Systems Optimization

Methods: Q-Learning