Shaping Advice in Deep Reinforcement Learning
Baicen Xiao, Bhaskar Ramasubramanian, Radha Poovendran

TL;DR
This paper introduces a method to augment reinforcement learning with shaping advice, using potential functions to improve learning efficiency and performance in sparse reward environments, with theoretical guarantees and practical algorithms for single and multi-agent settings.
Contribution
The paper proposes a novel approach to reward shaping in reinforcement learning using potential functions, providing theoretical analysis, algorithms, and empirical validation for both single and multi-agent systems.
Findings
Shaping advice accelerates policy learning in sparse reward tasks.
Agents with shaping advice achieve higher rewards faster.
Theoretical convergence guarantees are established for the proposed methods.
Abstract
Reinforcement learning involves agents interacting with an environment to complete tasks. When rewards provided by the environment are sparse, agents may not receive immediate feedback on the quality of actions that they take, thereby affecting learning of policies. In this paper, we propose methods to augment the reward signal from the environment with an additional reward termed shaping advice in both single and multi-agent reinforcement learning. The shaping advice is specified as a difference of potential functions at consecutive time-steps. Each potential function is a function of observations and actions of the agents. The use of potential functions is underpinned by an insight that the total potential when starting from any state and returning to the same state is always equal to zero. We show through theoretical analyses and experimental validation that the shaping advice…
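The abstract's description of shaping advice as a difference of potential functions at consecutive time-steps can be illustrated with a minimal sketch. This is not the authors' implementation: the names `shaping_advice`, `total_advice`, and `example_phi`, the toy one-dimensional states, and the `gamma` parameter are all assumptions introduced here. With `gamma` set to 1 the advice is a plain difference of potentials, so it telescopes along a trajectory and sums to zero over any path that returns to its starting state and action.

```python
from typing import Callable, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable
Potential = Callable[[State, Action], float]


def shaping_advice(phi: Potential, s: State, a: Action,
                   s_next: State, a_next: Action, gamma: float = 1.0) -> float:
    """Extra reward for one step: gamma * phi(s', a') - phi(s, a).

    gamma is an assumed discount factor; gamma = 1 gives the plain
    difference of potentials at consecutive time-steps.
    """
    return gamma * phi(s_next, a_next) - phi(s, a)


def total_advice(phi: Potential,
                 trajectory: Sequence[Tuple[State, Action]],
                 gamma: float = 1.0) -> float:
    """Sum the per-step advice over consecutive (state, action) pairs."""
    total = 0.0
    for (s, a), (s_next, a_next) in zip(trajectory, trajectory[1:]):
        total += shaping_advice(phi, s, a, s_next, a_next, gamma)
    return total


def example_phi(s: int, a: str) -> float:
    """Hypothetical potential: negative distance to a goal at position 3."""
    return -abs(3 - s)


# A path on a line that starts at position 0 and returns to it.
cycle = [(0, "right"), (1, "right"), (2, "left"), (1, "left"), (0, "right")]
cycle_total = total_advice(example_phi, cycle)
print(cycle_total)  # 0.0, since the undiscounted sum telescopes to phi(end) - phi(start)
```

In training, the value returned by `shaping_advice` would be added to the environment reward at each step. Steps toward the hypothetical goal receive positive advice and steps away receive negative advice, while a round trip earns nothing in total.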
Taxonomy
Topics: Reinforcement Learning in Robotics
