Shaping Advice in Deep Reinforcement Learning

Baicen Xiao; Bhaskar Ramasubramanian; Radha Poovendran

arXiv:2202.09489·cs.MA·February 22, 2022

TL;DR

This paper introduces a method to augment reinforcement learning with shaping advice, using potential functions to improve learning efficiency and performance in sparse reward environments, with theoretical guarantees and practical algorithms for single and multi-agent settings.

Contribution

The paper proposes a novel approach to reward shaping in reinforcement learning using potential functions, providing theoretical analysis, algorithms, and empirical validation for both single and multi-agent systems.

Findings

01. Shaping advice accelerates policy learning in sparse reward tasks.

02. Agents with shaping advice achieve higher rewards faster.

03. Theoretical convergence guarantees are established for the proposed methods.

Abstract

Reinforcement learning involves agents interacting with an environment to complete tasks. When rewards provided by the environment are sparse, agents may not receive immediate feedback on the quality of actions that they take, thereby affecting learning of policies. In this paper, we propose methods to augment the reward signal from the environment with an additional reward termed shaping advice in both single and multi-agent reinforcement learning. The shaping advice is specified as a difference of potential functions at consecutive time-steps. Each potential function is a function of observations and actions of the agents. The use of potential functions is underpinned by an insight that the total potential when starting from any state and returning to the same state is always equal to zero. We show through theoretical analyses and experimental validation that the shaping advice…
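
The abstract's description of shaping advice as a difference of potentials at consecutive time-steps can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy potential, and the trajectory are hypothetical, and the discount factor `gamma` is an assumption borrowed from the standard potential-based shaping form (set to 1 here so the cycle property holds exactly). It shows that the advice summed along a path that returns to its starting observation-action pair telescopes to zero.

```python
from typing import Callable, Hashable, List, Tuple

# One step of a trajectory: (observation, action). Hypothetical representation.
Step = Tuple[Hashable, Hashable]
Potential = Callable[[Hashable, Hashable], float]


def shaping_advice(phi: Potential, prev: Step, nxt: Step, gamma: float = 1.0) -> float:
    """Extra reward for moving from `prev` to `nxt`: a difference of potentials.

    `gamma` is an assumed discount factor; with gamma = 1 this is the plain
    difference phi(next) - phi(previous).
    """
    return gamma * phi(*nxt) - phi(*prev)


def total_advice(phi: Potential, trajectory: List[Step], gamma: float = 1.0) -> float:
    """Sum the shaping advice over every consecutive pair in the trajectory."""
    return sum(
        shaping_advice(phi, trajectory[t], trajectory[t + 1], gamma)
        for t in range(len(trajectory) - 1)
    )


def example_potential(obs: int, action: str) -> float:
    """Toy potential (assumption): higher near a goal at position 3,
    with a small bonus for choosing the action "right"."""
    return -abs(3 - obs) + (0.5 if action == "right" else 0.0)


# A path that starts and ends at the same (observation, action) pair.
cycle: List[Step] = [(0, "right"), (1, "right"), (2, "left"), (1, "left"), (0, "right")]

first_step_advice = shaping_advice(example_potential, cycle[0], cycle[1])
cycle_total = total_advice(example_potential, cycle)

print("advice on first step:", first_step_advice)
print("total advice around the cycle:", cycle_total)
```

In a training loop, a value like `first_step_advice` would be added to the environment reward at each step. Because the total around any cycle cancels, an agent cannot accumulate extra shaped reward simply by looping back to where it started.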

Code & Models

Repositories

baicenxiao/shaping-advice · tf · Official

Taxonomy

Topics: Reinforcement Learning in Robotics