Reinforcement Learning for Classical Planning: Viewing Heuristics as Dense Reward Generators

Clement Gehring; Masataro Asai; Rohan Chitnis; Tom Silver; Leslie Pack Kaelbling; Shirin Sohrabi; Michael Katz

arXiv:2109.14830 · cs.AI · March 8, 2022

Open Access

TL;DR

This paper introduces a method that uses classical planning heuristics as dense rewards in reinforcement learning, significantly improving sample efficiency and generalization in classical planning tasks.

Contribution

The paper integrates domain-independent classical planning heuristics into RL as dense reward signals, with the aim of improving learning efficiency and generalization in planning domains.

Findings

1. Improved sample efficiency over sparse-reward RL methods.

2. Effective generalization to new problem instances within the same domain.

3. Successful implementation using Neural Logic Machines.

Abstract

Recent advances in reinforcement learning (RL) have led to a growing interest in applying RL to classical planning domains or applying classical planning methods to some complex RL domains. However, the long-horizon goal-based problems found in classical planning lead to sparse rewards for RL, making direct application inefficient. In this paper, we propose to leverage domain-independent heuristic functions commonly used in the classical planning literature to improve the sample efficiency of RL. These classical heuristics act as dense reward generators to alleviate the sparse-rewards issue and enable our RL agent to learn domain-specific value functions as residuals on these heuristics, making learning easier. Correct application of this technique requires consolidating the discounted metric used in RL and the non-discounted metric used in heuristics. We implement the value functions…
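The abstract's idea of using a heuristic as a dense reward generator can be illustrated with standard potential-based reward shaping. The sketch below is not taken from the paper: the function names, the unit step cost of -1, and the specific discounted conversion of the heuristic are all assumptions chosen for illustration. It shows one plausible way to reconcile an undiscounted heuristic value h (an estimated number of steps to the goal) with a discounted RL objective, by turning h into a discounted cost-to-go before using it as a shaping potential.

```python
def discounted_cost(h, gamma):
    """Discounted sum of h unit step costs: 1 + gamma + ... + gamma**(h - 1).

    Assumption for this sketch: every action costs 1, so an undiscounted
    heuristic value h is read as "about h steps remain".
    """
    if gamma == 1.0:
        return float(h)
    return (1.0 - gamma ** h) / (1.0 - gamma)


def potential(h, gamma):
    """Shaping potential: the negated discounted cost-to-go estimate."""
    return -discounted_cost(h, gamma)


def shaped_reward(reward, h_s, h_next, gamma):
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(h_next, gamma) - potential(h_s, gamma)


def naive_shaped_reward(reward, h_s, h_next, gamma):
    """Same shaping, but with the raw undiscounted heuristic as the potential."""
    return reward + gamma * (-float(h_next)) - (-float(h_s))


if __name__ == "__main__":
    gamma = 0.9
    # One step along an optimal path with a perfect heuristic: 5 steps left, then 4.
    print(shaped_reward(-1.0, 5, 4, gamma))
    print(naive_shaped_reward(-1.0, 5, 4, gamma))
```

Under these assumptions, when the heuristic is exact the discounted potential cancels the step cost, so the shaped reward along an optimal path is zero and the value function left to learn (the residual on top of the heuristic) is zero as well. With the raw undiscounted heuristic as the potential, the same step yields a shaped reward of (1 - gamma) * (h_s - 1), which is nonzero and grows with the distance to the goal. That mismatch is one way to see why the discounted and non-discounted metrics need to be reconciled.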

Taxonomy

Topics: Receptor Mechanisms and Signaling · AI-based Problem Solving and Planning · Behavioral and Psychological Studies