Highway Graph to Accelerate Reinforcement Learning

Zidu Yin; Zhen Zhang; Dong Gong; Stefano V. Albrecht; Javen Q. Shi

arXiv:2405.11727 · cs.LG · January 8, 2025

TL;DR

This paper introduces the highway graph, a novel compact representation of state transitions that accelerates reinforcement learning by enabling multi-step value propagation, significantly improving training speed and generalization.

Contribution

The paper proposes the highway graph, which models non-branching sequences of state transitions so that a value update can propagate across many steps at once instead of one step at a time, leading to faster training.

Findings

01. Achieves 10 to 150 times faster learning in various environments.

02. Maintains or improves expected returns compared to state-of-the-art RL algorithms.

03. Enhances generalization and reduces storage costs for neural network agents.

Abstract

Reinforcement Learning (RL) algorithms often struggle with low training efficiency. A common approach to address this challenge is integrating model-based planning algorithms, such as Monte Carlo Tree Search (MCTS) or Value Iteration (VI), into the environmental model. However, VI requires iterating over a large tensor, updating the value of each preceding state from its succeeding state through value propagation, which is computationally intensive. To enhance RL training efficiency, we propose improving the efficiency of the value learning process. In deterministic environments with discrete state and action spaces, we observe that on the sampled empirical state-transition graph, a non-branching sequence of transitions (termed a highway) can take the agent to another state without deviation through intermediate states. On these non-branching highways, the…
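
The highway idea described above can be illustrated with a small sketch. This is not the authors' implementation: the function names (build_highway_graph, value_iteration), the toy graph, and the criterion for an "intermediate" state (exactly one incoming and one outgoing transition) are assumptions made for illustration. The sketch collapses each non-branching chain of a deterministic transition graph into a single edge that stores the discounted reward sum and the chain length, then runs value iteration on the smaller graph.

```python
def build_highway_graph(transitions, gamma):
    """Collapse non-branching chains of a deterministic transition graph.

    transitions: iterable of (state, action, reward, next_state) tuples.
    Returns {state: {action: (discounted_reward, n_steps, end_state)}},
    containing only the states kept as intersections.
    Note: isolated cycles made only of intermediate states are not
    handled in this sketch; they are simply left out of the result.
    """
    out_edges = {}  # state -> {action: (reward, next_state)}
    in_edges = {}   # state -> {(prev_state, action)}
    states = set()
    for s, a, r, s_next in transitions:
        out_edges.setdefault(s, {})[a] = (r, s_next)
        in_edges.setdefault(s_next, set()).add((s, a))
        states.update((s, s_next))

    def is_intermediate(s):
        # Assumed criterion: exactly one way in and exactly one way out.
        return len(out_edges.get(s, {})) == 1 and len(in_edges.get(s, ())) == 1

    highway = {}
    for s in states:
        if is_intermediate(s):
            continue
        highway[s] = {}
        for a, (r, nxt) in out_edges.get(s, {}).items():
            total, steps = r, 1
            # Follow the chain until the next intersection is reached.
            while is_intermediate(nxt):
                r_next, nxt_next = next(iter(out_edges[nxt].values()))
                total += gamma ** steps * r_next
                steps += 1
                nxt = nxt_next
            highway[s][a] = (total, steps, nxt)
    return highway


def value_iteration(highway, gamma, sweeps=100):
    """Value iteration where each highway edge is a single multi-step update."""
    values = {s: 0.0 for s in highway}
    for _ in range(sweeps):
        for s, edges in highway.items():
            if edges:  # states without outgoing edges keep value 0
                values[s] = max(
                    reward + gamma ** n * values[end]
                    for reward, n, end in edges.values()
                )
    return values


# Hypothetical toy graph: A branches; B, C, E are non-branching
# intermediates; D is terminal.
toy = [
    ("A", 0, 0.0, "B"), ("B", 0, 0.0, "C"), ("C", 0, 1.0, "D"),
    ("A", 1, 0.0, "E"), ("E", 0, 0.5, "D"),
]
highway = build_highway_graph(toy, gamma=0.9)
values = value_iteration(highway, gamma=0.9)
```

In this toy graph only A and D survive as nodes, so each sweep touches two states instead of five, and the reward at the end of the three-step chain reaches A in a single update rather than three.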
