Towards Sample Efficient Agents through Algorithmic Alignment
Mingxuan Li, Michael L. Littman

TL;DR
This paper introduces Deep Graph Value Network (DeepGV), a graph neural network-based method that uses structured algorithms such as dynamic programming to improve the sample efficiency of deep reinforcement learning agents.
Contribution
It proposes DeepGV, a novel graph neural network approach that incorporates algorithmic structures to enhance sample efficiency in reinforcement learning.
Findings
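DeepGV outperforms unstructured baselines by a large margin when solving Markov Decision Processes (MDPs).
Consistent with recent work on algorithmic alignment, neural networks with structured computation procedures can be trained efficiently.
The authors believe the approach opens a new avenue for structured agent design.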
Abstract
In this work, we propose and explore Deep Graph Value Network (DeepGV) as a promising method to work around sample complexity in deep reinforcement-learning agents using a message-passing mechanism. The main idea is that the agent should be guided by structured non-neural-network algorithms like dynamic programming. According to recent advances in algorithmic alignment, neural networks with structured computation procedures can be trained efficiently. We demonstrate the potential of graph neural networks in supporting sample-efficient learning by showing that Deep Graph Value Network can outperform unstructured baselines by a large margin in solving Markov Decision Processes (MDPs). We believe this would open up a new avenue for structured agent design. See https://github.com/drmeerkat/Deep-Graph-Value-Network for the code.
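To make the link between dynamic programming and message passing concrete, here is a minimal sketch of tabular value iteration written as rounds of message passing over an MDP's transition graph. It is an illustration under stated assumptions, not the DeepGV architecture: nothing is learned, and the function name, the transition-tuple format, and the toy chain MDP are all hypothetical choices made for this example.

```python
from collections import defaultdict


def value_iteration_message_passing(transitions, num_states, gamma=0.9, num_rounds=50):
    """Tabular value iteration phrased as message passing (illustrative only).

    transitions: list of (state, action, next_state, prob, reward) edges.
    This tuple format is an assumption of the sketch, not taken from the paper.
    """
    values = [0.0] * num_states
    for _ in range(num_rounds):
        # Message step: each edge sends prob * (reward + gamma * V[next_state])
        # back to its source (state, action) pair, where messages are summed.
        q_values = defaultdict(float)
        for state, action, next_state, prob, reward in transitions:
            q_values[(state, action)] += prob * (reward + gamma * values[next_state])

        # Aggregation step: each state keeps the max over its actions.
        best = {}
        for (state, _action), q in q_values.items():
            best[state] = max(best.get(state, float("-inf")), q)

        # States with no outgoing edges are treated as terminal with value 0.
        values = [best.get(state, 0.0) for state in range(num_states)]
    return values


# Hypothetical 3-state chain: 0 -> 1 -> 2, where state 2 is terminal.
# Entering state 2 yields reward 1; state 0 may also "stay" for reward 0.
toy_transitions = [
    (0, "right", 1, 1.0, 0.0),
    (0, "stay", 0, 1.0, 0.0),
    (1, "right", 2, 1.0, 1.0),
]
values = value_iteration_message_passing(toy_transitions, num_states=3)
```

Each round is one sweep of the Bellman optimality backup: a sum over successor states followed by a max over actions. A graph neural network whose layers share this sum-then-aggregate shape is the kind of structure that algorithmic alignment refers to, though how DeepGV parameterizes those steps is not specified in the abstract.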
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Graph Neural Networks · Data Stream Mining Techniques
Methods: Graph Neural Network
