Potential Field Guided Actor-Critic Reinforcement Learning

Weiya Ren

arXiv:2006.06923 · cs.LG · June 15, 2020

TL;DR

This paper introduces a novel actor-critic reinforcement learning method that integrates potential-field-based critics with reward-based critics to improve policy evaluation, especially in obstacle avoidance and multi-agent cooperation.

Contribution

The paper proposes combining potential-field-based critics with reward-based critics in actor-critic RL, enhancing policy evaluation using prior information and planning.

Findings

01 Effective obstacle avoidance is demonstrated in a predator-prey game

02 Potential field integration accelerates policy learning

03 Prior information improves multi-agent cooperation

Abstract

In this paper, we consider the problem of actor-critic reinforcement learning. First, we extend the actor-critic architecture to an actor-critic-N architecture by introducing additional critics beyond the reward-based one. Second, we combine the reward-based critic with a potential-field-based critic to formulate the proposed potential field guided actor-critic reinforcement learning approach (actor-critic-2). This can be seen as a combination of model-based gradients and model-free gradients in policy improvement. A state with a large potential field often carries strong prior information, such as pointing toward a distant target or steering away from a collision beside an obstacle. In such states, we should place more trust in the potential-field-based critic for policy evaluation, which accelerates policy improvement by guiding the action policy. For example, in practical application, learning…
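
The idea of trusting the potential-field-based critic more where the potential field is large can be illustrated with a minimal sketch. The abstract excerpt does not state the paper's actual combination rule, so the exponential trust weight, the `scale` parameter, and the function names below are assumptions made purely for illustration.

```python
import math


def trust_weight(potential_magnitude: float, scale: float = 1.0) -> float:
    """Hypothetical trust weight in [0, 1] for the potential-field critic.

    Assumption: trust grows smoothly with the magnitude of the potential
    field and is zero when the field vanishes.
    """
    return 1.0 - math.exp(-potential_magnitude / scale)


def combined_evaluation(
    q_reward: float,
    q_potential: float,
    potential_magnitude: float,
    scale: float = 1.0,
) -> float:
    """Blend two critic outputs into a single policy evaluation.

    q_reward:    value estimate from the reward-based critic
    q_potential: value estimate from the potential-field-based critic
    The convex combination is an illustrative choice, not the paper's rule.
    """
    w = trust_weight(potential_magnitude, scale)
    return (1.0 - w) * q_reward + w * q_potential


if __name__ == "__main__":
    # Far from obstacles and targets: weak field, reward critic dominates.
    print(combined_evaluation(q_reward=2.0, q_potential=5.0, potential_magnitude=0.0))
    # Next to an obstacle: strong field, potential-field critic dominates.
    print(combined_evaluation(q_reward=2.0, q_potential=5.0, potential_magnitude=50.0))
```

Under this sketch, a state with no potential field is evaluated by the reward-based critic alone, and the evaluation shifts toward the potential-field-based critic as the field strengthens.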

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Adaptive Dynamic Programming Control