Learning General Policies with Policy Gradient Methods
Simon Ståhlberg, Blai Bonet, Hector Geffner

TL;DR
This paper studies when deep reinforcement learning, combined with graph neural networks, can learn policies that generalize as well as those found by classical combinatorial methods, and examines the scalability and expressiveness limits involved.
Contribution
The paper introduces a framework that models policies as state transition classifiers implemented with GNNs, so that learned policies generalize nearly as well as those from combinatorial approaches while avoiding their scalability bottleneck.
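To illustrate what treating a policy as a state transition classifier can look like, here is a minimal sketch, not the authors' implementation. It assumes a policy that assigns a probability to each successor state of the current state through a softmax over scores. The names `transition_policy` and `toy_score` are hypothetical, and `toy_score` is a hand-written stand-in for the learned GNN scorer the summary mentions.

```python
import math
from typing import Callable, List, Sequence

# Assumption: a state is already encoded as a plain feature vector.
# In the paper's setting a GNN would compute the representation instead.
State = Sequence[float]


def transition_policy(
    score: Callable[[State, State], float],
    state: State,
    successors: List[State],
) -> List[float]:
    """Return one probability per successor: a softmax over score(state, s')."""
    logits = [score(state, succ) for succ in successors]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - top) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_score(state: State, succ: State) -> float:
    # Hypothetical stand-in for a learned scorer:
    # prefers successors whose features sum to a smaller value.
    return -sum(succ)


# Two candidate transitions from the same state.
probs = transition_policy(toy_score, [3.0], [[2.0], [4.0]])
```

Because the policy chooses among successor states rather than among action labels, the same scorer can be applied to instances with different numbers of objects and applicable actions, which is what makes this formulation convenient for generalization across instances.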
Findings
Actor-critic methods achieve near-combinatorial generalization
Limitations stem from GNN expressiveness and policy optimality tradeoffs
Adding derived predicates improves policy generalization
Abstract
While reinforcement learning methods have delivered remarkable results in a number of settings, generalization, i.e., the ability to produce policies that generalize in a reliable and systematic way, has remained a challenge. The problem of generalization has been addressed formally in classical planning, where provably correct policies that generalize over all instances of a given domain have been learned using combinatorial methods. The aim of this work is to bring these two research threads together to illuminate the conditions under which (deep) reinforcement learning approaches, and in particular, policy optimization methods, can be used to learn policies that generalize like combinatorial methods do. We draw on lessons learned from previous combinatorial and deep learning approaches, and extend them in a convenient way. From the former, we model policies as state transition…
Taxonomy
Topics: Reinforcement Learning in Robotics · AI-based Problem Solving and Planning · Artificial Intelligence in Games
