Reward Bonuses with Gain Scheduling Inspired by Iterative Deepening Search

Taisuke Kobayashi

arXiv:2212.10765 · cs.LG · July 4, 2023

TL;DR

This paper proposes a gain scheduling method inspired by iterative deepening search to enhance reinforcement learning by combining bonuses analogous to depth-first and breadth-first search, improving performance across various tasks.

Contribution

It introduces a novel gain scheduling approach that combines two types of intrinsic bonuses, inspired by search algorithms, to improve reinforcement learning efficiency and effectiveness.

Findings

01. Bonuses improve performance in both dense and sparse reward tasks.
02. Gain scheduling enhances the contribution of each bonus type.
03. All tested tasks achieved high performance with the combined method.

Abstract

This paper introduces a novel method of adding intrinsic bonuses to a task-oriented reward function in order to efficiently facilitate reinforcement learning search. While various bonuses have been designed to date, they are analogous to the depth-first and breadth-first search algorithms in graph theory. This paper, therefore, first designs two bonuses for each of them. Then, heuristic gain scheduling, inspired by iterative deepening search, which is known to inherit the advantages of the two search algorithms, is applied to the designed bonuses. The proposed method is expected to allow an agent to efficiently reach the best solution in deeper states by gradually exploring unknown states. In three locomotion tasks with dense rewards and three simple tasks with sparse rewards, it is shown that the two types of bonuses contribute to the performance improvement of the different tasks…
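The idea of adding two gain-weighted bonuses to a task reward can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: the function names `scheduled_gains` and `shaped_reward`, the `total_steps` parameter, and the linear ramp (breadth-like gain decaying while the depth-like gain grows) are all hypothetical placeholders, since the abstract does not specify the actual heuristic schedule or the form of either bonus.

```python
def scheduled_gains(step, total_steps):
    """Return (depth_gain, breadth_gain) for the current training step.

    ASSUMPTION: a plain linear ramp stands in for the paper's heuristic
    schedule, which the abstract does not specify.
    """
    # Clamp training progress to [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)
    depth_gain = progress          # hypothetical: grows over training
    breadth_gain = 1.0 - progress  # hypothetical: decays over training
    return depth_gain, breadth_gain


def shaped_reward(task_reward, depth_bonus, breadth_bonus, step, total_steps):
    """Task reward plus the two intrinsic bonuses, each scaled by its gain."""
    depth_gain, breadth_gain = scheduled_gains(step, total_steps)
    return task_reward + depth_gain * depth_bonus + breadth_gain * breadth_bonus


if __name__ == "__main__":
    # Halfway through training, both bonuses are weighted equally.
    print(shaped_reward(1.0, 2.0, 4.0, step=50, total_steps=100))
```

Keeping the schedule in its own function means a different heuristic could be swapped in without touching the reward combination.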

Taxonomy

Topics: Reinforcement Learning in Robotics · Optimization and Search Problems