Highway Graph to Accelerate Reinforcement Learning
Zidu Yin, Zhen Zhang, Dong Gong, Stefano V. Albrecht, Javen Q. Shi

TL;DR
This paper introduces the highway graph, a novel compact representation of state transitions that accelerates reinforcement learning by enabling multi-step value propagation, significantly improving training speed and generalization.
Contribution
The paper proposes the highway graph, a new method to streamline value updates in RL by modeling non-branching transition sequences, leading to faster training.
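As a rough illustration of what multi-step value propagation along a non-branching sequence can look like, consider a deterministic path $s_0, s_1, \dots, s_n$ in which every intermediate state $s_1, \dots, s_{n-1}$ has a single observed outgoing transition. The notation ($r_k$ for the reward on the $k$-th step, $\gamma$ for the discount factor, $a_0$ for the action taken at $s_0$) is an assumption made here for illustration and is not taken from the paper. Under that assumption, the $n$ one-step Bellman backups along the path collapse into a single update:

```latex
Q(s_0, a_0) \;=\; \sum_{k=0}^{n-1} \gamma^{k}\, r_k \;+\; \gamma^{n}\, V(s_n),
\qquad
V(s_0) \;=\; \max_{a} Q(s_0, a)
```

The discounted reward sum can be computed once when the path is identified, so each later value update crosses the whole path in one step.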
Findings
Achieves 10 to 150 times faster learning in various environments.
Maintains or improves expected returns compared to state-of-the-art RL algorithms.
Enhances generalization and reduces storage costs for neural network agents.
Abstract
Reinforcement Learning (RL) algorithms often struggle with low training efficiency. A common approach to address this challenge is integrating model-based planning algorithms, such as Monte Carlo Tree Search (MCTS) or Value Iteration (VI), into the environmental model. However, VI requires iterating over a large tensor which updates the value of the preceding state based on the succeeding state through value propagation, resulting in computationally intensive operations. To enhance the RL training efficiency, we propose improving the efficiency of the value learning process. In deterministic environments with discrete state and action spaces, we observe that on the sampled empirical state-transition graph, a non-branching sequence of transitions, termed a highway, can take the agent to another state without deviation through intermediate states. On these non-branching highways, the…
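The idea described in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`build_graph`, `compress`, `value_iteration`), the rule for deciding which states are pass-through states (exactly one incoming and one outgoing observed transition), and the toy transitions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_graph(transitions):
    """Empirical graph of a deterministic env: state -> {action: (next_state, reward)}."""
    graph = defaultdict(dict)
    for s, a, r, s2 in transitions:
        graph[s][a] = (s2, r)
        graph.setdefault(s2, {})  # make sure terminal states appear as nodes
    return dict(graph)


def compress(graph, gamma):
    """Merge non-branching chains into single edges (assumed 'highway' construction).

    Returns state -> {action: (end_state, discounted_return, n_steps)},
    keeping only states that are not pass-through states.
    """
    indeg = defaultdict(int)
    for edges in graph.values():
        for s2, _ in edges.values():
            indeg[s2] += 1

    def passthrough(s):
        # Assumption: a state is skipped if it has one way in and one way out.
        return indeg[s] == 1 and len(graph[s]) == 1

    highways = {}
    for s, edges in graph.items():
        if passthrough(s):
            continue
        highways[s] = {}
        for a, (nxt, r) in edges.items():
            ret, steps = r, 1
            while passthrough(nxt):
                nxt, r = next(iter(graph[nxt].values()))
                ret += gamma ** steps * r  # accumulate discounted reward
                steps += 1
            highways[s][a] = (nxt, ret, steps)
    return highways


def value_iteration(highways, gamma, sweeps=100):
    """Value iteration where each backup crosses a whole highway at once."""
    V = {s: 0.0 for s in highways}
    for _ in range(sweeps):
        for s, edges in highways.items():
            if edges:  # states without outgoing edges keep value 0
                V[s] = max(ret + gamma ** n * V[s2]
                           for s2, ret, n in edges.values())
    return V


# Toy data (hypothetical): a 4-step corridor A -> B -> C -> D -> E with a
# reward of 1 on the last step, plus a 1-step side exit A -> F with reward 0.5.
transitions = [
    ("A", "right", 0.0, "B"),
    ("B", "right", 0.0, "C"),
    ("C", "right", 0.0, "D"),
    ("D", "right", 1.0, "E"),
    ("A", "down", 0.5, "F"),
]
gamma = 0.9
highways = compress(build_graph(transitions), gamma)
V = value_iteration(highways, gamma, sweeps=1)
print(highways["A"])
print(V["A"])
```

In this toy example the corridor through B, C and D becomes a single edge from A to E, so one sweep is enough to propagate the final reward back to A (a value of about 0.729), whereas one-step backups on the uncompressed graph can need a sweep per transition.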
Taxonomy
Topics: Traffic control and management
Methods: Focus
