Subgoal-based Reward Shaping to Improve Efficiency in Reinforcement Learning
Takato Okudo, Seiji Yamada

TL;DR
This paper introduces a subgoal-based reward shaping method that leverages human knowledge of subgoals to significantly improve learning efficiency in reinforcement learning across various domains.
Contribution
The authors extend potential-based reward shaping to incorporate human-understandable subgoals, making it easier for humans to enhance reinforcement learning performance.
Findings
The proposed method outperforms the baseline and other subgoal-based methods in learning efficiency.
Experiments conducted in three different domains validate the effectiveness of the proposed approach.
Human-provided subgoals significantly accelerate reinforcement learning processes.
Abstract
Reinforcement learning, which acquires a policy maximizing long-term rewards, has been actively studied. Unfortunately, this learning type is too slow and difficult to use in practical situations because the state-action space becomes huge in real environments. Many studies have incorporated human knowledge into reinforcement learning. Though human knowledge on trajectories is often used, a human could be asked to control an AI agent, which can be difficult. Knowledge on subgoals may lessen this requirement because humans need only to consider a few representative states on an optimal trajectory in their minds. The essential factor for learning efficiency is rewards. Potential-based reward shaping is a basic method for enriching rewards. However, it is often difficult to incorporate subgoals for accelerating learning over potential-based reward shaping. This is because the appropriate…
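As a rough illustration of the potential-based reward shaping mentioned in the abstract, the sketch below adds the standard shaping term gamma * Phi(s') - Phi(s) to the environment reward. The potential used here, which simply grows with the number of subgoals reached so far, is a hypothetical stand-in and is not taken from the paper; the names `shaped_reward`, `subgoal_potential`, `eta`, and the discount value are likewise assumptions made for the example.

```python
GAMMA = 0.99  # discount factor (assumed value, not from the paper)


def shaped_reward(reward, phi_s, phi_next, gamma=GAMMA):
    """Return the environment reward plus the potential-based shaping term.

    The shaping term has the standard form gamma * Phi(s') - Phi(s).
    """
    return reward + gamma * phi_next - phi_s


def subgoal_potential(num_achieved, eta=1.0):
    """Hypothetical potential: proportional to the count of subgoals reached.

    This is an illustrative choice only, not the authors' definition.
    """
    return eta * num_achieved


def shape_episode(rewards, subgoal_counts, gamma=GAMMA, eta=1.0):
    """Shape every transition of one episode.

    rewards[t] is the environment reward for the step from state t to t + 1;
    subgoal_counts[t] is the number of subgoals reached by state t, so
    subgoal_counts must have one more entry than rewards.
    """
    if len(subgoal_counts) != len(rewards) + 1:
        raise ValueError("subgoal_counts needs len(rewards) + 1 entries")
    shaped = []
    for t, r in enumerate(rewards):
        phi_s = subgoal_potential(subgoal_counts[t], eta)
        phi_next = subgoal_potential(subgoal_counts[t + 1], eta)
        shaped.append(shaped_reward(r, phi_s, phi_next, gamma))
    return shaped


if __name__ == "__main__":
    # Toy episode with sparse reward: the agent passes two subgoals
    # before receiving the only non-zero environment reward at the end.
    env_rewards = [0.0, 0.0, 0.0, 1.0]
    counts = [0, 1, 1, 2, 2]
    for t, r in enumerate(shape_episode(env_rewards, counts)):
        print(f"step {t}: shaped reward = {r:.4f}")
```

In this toy episode the steps that reach a new subgoal receive a positive bonus even though the environment reward is still zero, which is the kind of denser feedback the abstract attributes to reward shaping.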
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Neural dynamics and brain function
