Learning General Policies with Policy Gradient Methods

Simon Ståhlberg; Blai Bonet; Hector Geffner

arXiv:2512.19366 · cs.AI · December 23, 2025

Open Access

TL;DR

This paper examines the conditions under which deep reinforcement learning, combined with graph neural networks (GNNs), can learn policies that generalize the way combinatorial methods from classical planning do, while addressing limitations in scalability and expressiveness.

Contribution

The paper introduces a framework that models policies as state transition classifiers represented with GNNs, so that learned policies generalize comparably to those from combinatorial approaches while avoiding the scalability issues of those approaches.
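
To make the idea of a policy as a state transition classifier concrete, here is a minimal sketch. It assumes the policy is a softmax over scores assigned to the successor states of the current state. That is an illustrative choice, not something this summary confirms. The names `transition_policy`, `toy_score`, and `greedy_successor` are hypothetical, and `toy_score` is a hand-written stand-in for the learned GNN, which is not reproduced here.

```python
import math
from typing import Callable, List, Sequence

Scorer = Callable[[int, int], float]


def transition_policy(state: int, successors: Sequence[int],
                      score: Scorer) -> List[float]:
    """Softmax distribution over the successor states of `state`."""
    logits = [score(state, nxt) for nxt in successors]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - top) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def toy_score(state: int, nxt: int) -> float:
    """Hypothetical stand-in for a learned scorer: prefers states nearer 0."""
    return -float(abs(nxt))


def greedy_successor(state: int, successors: Sequence[int],
                     score: Scorer) -> int:
    """Pick the successor state with the highest policy probability."""
    probs = transition_policy(state, successors, score)
    best = max(range(len(successors)), key=probs.__getitem__)
    return successors[best]


# Toy domain (an assumption for illustration): states are integers,
# each state can move to state - 1 or state + 1, and the goal is 0.
probs = transition_policy(3, [2, 4], toy_score)
chosen = greedy_successor(3, [2, 4], toy_score)
```

Because the policy ranks successor states rather than named actions, the same scorer applies to any instance of the domain, whatever actions that instance has.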

Findings

01 Actor-critic methods generalize nearly as well as combinatorial methods

02 Remaining limitations stem from GNN expressiveness and from tradeoffs between policy optimality and generalization

03 Adding derived predicates improves policy generalization

Abstract

While reinforcement learning methods have delivered remarkable results in a number of settings, generalization, i.e., the ability to produce policies that generalize in a reliable and systematic way, has remained a challenge. The problem of generalization has been addressed formally in classical planning, where provably correct policies that generalize over all instances of a given domain have been learned using combinatorial methods. The aim of this work is to bring these two research threads together to illuminate the conditions under which (deep) reinforcement learning approaches, and in particular, policy optimization methods, can be used to learn policies that generalize like combinatorial methods do. We draw on lessons learned from previous combinatorial and deep learning approaches, and extend them in a convenient way. From the former, we model policies as state transition…

Taxonomy

Topics: Reinforcement Learning in Robotics · AI-based Problem Solving and Planning · Artificial Intelligence in Games