Graph neural induction of value iteration
Andreea Deac, Pierre-Luc Bacon, Jian Tang

TL;DR
This paper introduces a graph neural network that explicitly executes the value iteration algorithm, enabling more accurate planning in reinforcement learning across diverse environment models with direct supervision on intermediate steps.
Contribution
It proposes a GNN-based approach to directly implement value iteration, relaxing previous environment restrictions and improving planning accuracy in reinforcement learning.
Findings
GNNs model value iteration accurately.
The learned executors recover favourable metrics and policies.
These results hold across a variety of out-of-distribution tests.
Abstract
Many reinforcement learning tasks can benefit from explicit planning based on an internal model of the environment. Previously, such planning components have been incorporated through a neural network that partially aligns with the computational graph of value iteration. Such networks have so far been focused on restrictive environments (e.g. grid-worlds), and modelled the planning procedure only indirectly. We relax these constraints, proposing a graph neural network (GNN) that executes the value iteration (VI) algorithm, across arbitrary environment models, with direct supervision on the intermediate steps of VI. The results indicate that GNNs are able to model value iteration accurately, recovering favourable metrics and policies across a variety of out-of-distribution tests. This suggests that GNN executors with strong supervision are a viable component within deep reinforcement…
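For readers unfamiliar with the algorithm the abstract refers to, the following is a minimal sketch of classical tabular value iteration in plain Python. It is not the authors' GNN. It only shows the reference procedure and records every intermediate value vector, which is the kind of per-step signal that "direct supervision on the intermediate steps of VI" could plausibly use. The function names, the `P[a][s][s2]` and `R[s][a]` layouts, and the two-state example MDP are all illustrative assumptions.

```python
def value_iteration_trace(P, R, gamma=0.9, num_steps=50):
    """Run tabular value iteration and keep every intermediate value vector.

    P[a][s][s2] is the probability of moving from state s to s2 under action a.
    R[s][a] is the immediate reward for taking action a in state s.
    Both layouts are assumptions made for this sketch.
    """
    num_actions = len(P)
    num_states = len(P[0])
    values = [0.0] * num_states
    trace = [values]  # trace[t] holds the value estimate after t updates
    for _ in range(num_steps):
        new_values = []
        for s in range(num_states):
            # Bellman optimality backup: best one-step lookahead over actions.
            q_values = [
                R[s][a] + gamma * sum(P[a][s][s2] * values[s2]
                                      for s2 in range(num_states))
                for a in range(num_actions)
            ]
            new_values.append(max(q_values))
        values = new_values
        trace.append(values)
    return trace


def greedy_policy(P, R, values, gamma=0.9):
    """Pick, for each state, the action with the highest one-step lookahead."""
    num_actions = len(P)
    num_states = len(P[0])
    policy = []
    for s in range(num_states):
        q_values = [
            R[s][a] + gamma * sum(P[a][s][s2] * values[s2]
                                  for s2 in range(num_states))
            for a in range(num_actions)
        ]
        policy.append(q_values.index(max(q_values)))
    return policy


# Hypothetical two-state MDP: action 0 stays put, action 1 moves to the other
# state. Only staying in state 1 yields reward.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: move
]
R = [
    [0.0, 0.0],  # rewards in state 0
    [1.0, 0.0],  # rewards in state 1
]

trace = value_iteration_trace(P, R, gamma=0.5, num_steps=50)
policy = greedy_policy(P, R, trace[-1], gamma=0.5)
```

In this toy example the value estimates approach 1 for state 0 and 2 for state 1, and the greedy policy moves out of state 0 and stays in state 1. A learned executor in the spirit of the abstract would be trained to reproduce each entry of `trace` on a graph whose nodes are states, although how the paper encodes the environment as a graph is not specified in the text above.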
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
Methods: Graph Neural Network
