Incremental Reinforcement Learning --- a New Continuous Reinforcement Learning Frame Based on Stochastic Differential Equation methods

Tianhao Chen; Limei Cheng; Yang Liu; Wenchuan Jia; Shugen Ma

arXiv:1908.02974·cs.LG·August 9, 2019·5 cites

Open Access

TL;DR

This paper introduces Incremental Reinforcement Learning (IRL), a novel continuous RL framework based on stochastic differential equations that ensures action continuity, variance control, and proactive environment interaction.

Contribution

The paper presents IRL, a new continuous reinforcement learning method that overcomes theoretical limitations of existing methods by using stochastic differential equations.

Findings

01 Guarantees action continuity within any time interval

02 Controls variance of actions during training

03 Enables agents to predict scene changes for better decision-making

Abstract

Continuous reinforcement learning methods such as DDPG and A3C are widely used in robot control and autonomous driving. However, both methods have theoretical weaknesses: DDPG cannot control noise in the control process, and A3C does not satisfy the continuity conditions under the Gaussian policy. To address these concerns, we propose a new continuous reinforcement learning method based on stochastic differential equations, which we call Incremental Reinforcement Learning (IRL). This method not only guarantees the continuity of actions within any time interval but also controls the variance of actions during training. In addition, our method does not assume Markov control in agents' action control and allows agents to predict scene changes for action selection. With our method, agents no longer passively adapt to the environment. Instead, they actively interact with the environment for…
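The continuity claim in the abstract can be illustrated with a small numerical sketch. This is not the authors' algorithm: the drift function, the noise scale `sigma`, and every function name below are assumptions chosen for illustration. The sketch contrasts an action that evolves by stochastic-differential-equation increments (simulated with the standard Euler-Maruyama scheme) against actions drawn independently from a Gaussian at every step.

```python
import math
import random


def sde_actions(a0, drift, sigma, dt, steps, rng):
    """Euler-Maruyama path of da = drift(a) dt + sigma dW (hypothetical action dynamics)."""
    actions = [a0]
    for _ in range(steps):
        a = actions[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment, distributed N(0, dt)
        actions.append(a + drift(a) * dt + sigma * dw)
    return actions


def iid_gaussian_actions(mean, sigma, steps, rng):
    """Actions drawn independently at every step, as with a per-step Gaussian policy."""
    return [rng.gauss(mean, sigma) for _ in range(steps + 1)]


def max_jump(actions):
    """Largest change between two consecutive actions."""
    return max(abs(b - a) for a, b in zip(actions, actions[1:]))


def compare(dt, horizon=1.0, sigma=0.5, seed=0):
    """Return (largest SDE jump, largest i.i.d. jump) over the same horizon."""
    rng = random.Random(seed)
    steps = int(round(horizon / dt))
    drift = lambda a: -a  # illustrative mean-reverting drift, not taken from the paper
    sde = sde_actions(0.0, drift, sigma, dt, steps, rng)
    iid = iid_gaussian_actions(0.0, sigma, steps, rng)
    return max_jump(sde), max_jump(iid)


if __name__ == "__main__":
    for dt in (1e-2, 1e-3, 1e-4):
        sde_jump, iid_jump = compare(dt)
        print(f"dt={dt:g}  max SDE jump={sde_jump:.4f}  max i.i.d. jump={iid_jump:.4f}")
```

Under these assumptions, the largest step-to-step change of the SDE-driven action shrinks as `dt` decreases, because each increment has standard deviation `sigma * sqrt(dt)`, while independently sampled actions keep jumping by amounts on the order of `sigma` no matter how fine the time grid is. The same `sigma` parameter is the natural handle for limiting action variance in this toy setting.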

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Energy, Environment, and Transportation Policies

Methods: Experience Replay · Entropy Regularization · Dense Connections · Weight Decay · Adam · Softmax · Convolution · Batch Normalization · Deep Deterministic Policy Gradient