# Reinforcement Learning for Traffic Control with Adaptive Horizon

**Authors:** Wentao Chen, Tehuan Chen, Guang Lin

arXiv: 1903.12348 · 2019-04-01

## TL;DR

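This paper introduces a Q-learning approach to traffic signal control with an adaptive horizon. In the reported experiments it reaches a stable solution sooner and at lower control cost than model predictive control (MPC), and it remains robust under uncertainty in traffic demand and turning rates.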

## Contribution

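The paper presents a Q-learning-based controller for green-light times at network intersections, paired with an adaptive controller that continuously shrinks the action space to the subset of actions most likely to be optimal, which improves the efficiency of the Q-learning algorithm.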

## Key findings

- Outperforms MPC by reaching a stable solution in a shorter period and achieving lower control costs
- Remains robust under 30% traffic demand uncertainty and 15% turning rate uncertainty
- Adaptive shrinking of the action space improves the efficiency of the Q-learning algorithm
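
As a rough illustration of what the quoted uncertainty levels could mean in a simulation, the sketch below perturbs a nominal inflow by up to 30% and a set of turning fractions by up to 15%. The uniform multiplicative noise, the renormalization of the turning fractions, and the function names `perturb_demand` and `perturb_turning_rates` are assumptions made for illustration; this summary does not state how the paper samples its uncertainties.

```python
import random


def perturb_demand(nominal, level, rng):
    """Scale a nominal inflow by a factor drawn uniformly from [1 - level, 1 + level].

    Assumption: uniform multiplicative noise; the paper's noise model may differ.
    """
    return nominal * (1.0 + rng.uniform(-level, level))


def perturb_turning_rates(rates, level, rng):
    """Perturb each turning fraction by up to +/- level (relative), then renormalize.

    Renormalizing keeps the fractions summing to 1, so after this step an
    individual fraction can deviate slightly more than `level` from nominal.
    """
    noisy = [r * (1.0 + rng.uniform(-level, level)) for r in rates]
    total = sum(noisy)
    return [r / total for r in noisy]


rng = random.Random(0)                      # fixed seed for repeatability
demand = perturb_demand(1000.0, 0.30, rng)  # hypothetical nominal inflow (veh/h)
turns = perturb_turning_rates([0.2, 0.5, 0.3], 0.15, rng)  # left, straight, right
```

A robustness test in this style would rerun the controller over many such samples and compare the resulting control costs with the nominal case.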

## Abstract

This paper proposes a reinforcement learning approach for traffic control with an adaptive horizon. To build the controller for the traffic network, a Q-learning-based strategy is applied that controls the green light passing time at the network intersections. The controller has two components: a regular Q-learning controller that controls the traffic light signal, and an adaptive controller that continuously optimizes the action space of the Q-learning algorithm to improve its efficiency. The regular Q-learning controller uses the control cost function as its reward function to choose actions. The adaptive controller examines the control cost and updates the action space by determining the subset of actions most likely to yield optimal results and shrinking the action space to that subset. Uncertainties in traffic influx and turning rate are introduced to test the robustness of the controller in a stochastic environment. Compared with model predictive control (MPC), the proposed Q-learning-based controller reaches a stable solution in a shorter period and achieves lower control costs. It is also robust under 30% traffic demand uncertainty and 15% turning rate uncertainty.
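
The two-component design described in the abstract can be sketched on a toy problem. The following is a minimal illustration under stated assumptions, not the authors' implementation: the single-intersection queue model, all constants, the epsilon-greedy policy, and the pruning rule (periodically drop the active green time with the highest average observed cost) are hypothetical choices, since this summary does not give the paper's traffic model or its criterion for shrinking the action space.

```python
import random

GREEN_TIMES = [10, 20, 30, 40, 50]  # candidate north-south green times (s); hypothetical
CYCLE = 60                          # fixed cycle length (s); hypothetical
SAT_FLOW = 0.5                      # vehicles discharged per second of green; hypothetical
MAX_Q = 20                          # queue cap per approach; hypothetical


def step(q_ns, q_ew, green, rng):
    """Advance the toy two-queue intersection by one cycle; return queues and cost."""
    q_ns = min(MAX_Q, max(0, q_ns + rng.randint(12, 18) - int(SAT_FLOW * green)))
    q_ew = min(MAX_Q, max(0, q_ew + rng.randint(6, 10) - int(SAT_FLOW * (CYCLE - green))))
    return q_ns, q_ew, q_ns + q_ew  # control cost = total queued vehicles


def train(episodes=200, steps=50, alpha=0.1, gamma=0.9, eps=0.1,
          shrink_every=50, min_actions=2, seed=0):
    rng = random.Random(seed)
    q = {}                                   # (state, action index) -> Q-value
    active = list(range(len(GREEN_TIMES)))   # current action space
    cost_sum = [0.0] * len(GREEN_TIMES)      # running cost statistics per action
    count = [0] * len(GREEN_TIMES)

    for ep in range(1, episodes + 1):
        q_ns, q_ew = 0, 0
        for _ in range(steps):
            s = (q_ns // 5, q_ew // 5)       # coarse queue buckets as the state
            # Regular Q-learning controller: epsilon-greedy over active actions.
            if rng.random() < eps:
                a = rng.choice(active)
            else:
                a = max(active, key=lambda i: q.get((s, i), 0.0))
            q_ns, q_ew, cost = step(q_ns, q_ew, GREEN_TIMES[a], rng)
            s2 = (q_ns // 5, q_ew // 5)
            # Reward is the negative control cost.
            best_next = max(q.get((s2, i), 0.0) for i in active)
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (-cost + gamma * best_next - old)
            cost_sum[a] += cost
            count[a] += 1
        # Adaptive controller: periodically shrink the action space by
        # removing the active action with the worst average observed cost.
        if ep % shrink_every == 0 and len(active) > min_actions:
            worst = max(active, key=lambda i: cost_sum[i] / count[i] if count[i] else 0.0)
            active.remove(worst)
    return q, active


q_table, remaining = train()
remaining_green_times = [GREEN_TIMES[i] for i in remaining]
```

With the default settings the action space shrinks from five candidate green times to two over the course of training, so later greedy steps and Bellman backups search a smaller set. Averaging cost per action ignores which state the action was taken in; a more careful criterion would condition on state.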

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.12348/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1903.12348/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/1903.12348/full.md

---
Source: https://tomesphere.com/paper/1903.12348