Offline Reinforcement Learning for Road Traffic Control
Mayuresh Kunjir, Sanjay Chawla

TL;DR
This paper introduces a model-based offline reinforcement learning framework for traffic signal control that leverages real traffic data to develop efficient and effective policies, avoiding simulation inaccuracies.
Contribution
It proposes a novel model-based offline RL approach with adaptive reward shaping for traffic control, improving data efficiency and policy performance.
Findings
High-performance traffic control policies achieved from real data
Adaptive reward shaping improves regularization and generalization
Model-based offline RL outperforms prior methods in complex scenarios
Abstract
Traffic signal control is an important problem in urban mobility with significant potential for economic and environmental impact. While there is growing interest in Reinforcement Learning (RL) for traffic signal control, the work so far has focused on learning through simulations, which could lead to inaccuracies due to simplifying assumptions. Instead, real experience data on traffic is available and could be exploited at minimal cost. Recent progress in offline or batch RL has enabled just that. Model-based offline RL methods, in particular, have been shown to generalize from the experience data much better than others. We build a model-based learning framework which infers a Markov Decision Process (MDP) from a dataset that is both commonplace and easy to gather, collected using a cyclic traffic signal control policy. The MDP is built with pessimistic costs to manage…
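To make the idea of an MDP with pessimistic costs concrete, here is a minimal tabular sketch of the general technique. It is not the paper's implementation: the function names `build_pessimistic_mdp` and `plan`, the count threshold `min_count`, the fixed `penalty`, and the toy two-state intersection are all assumptions chosen for illustration. The sketch estimates transition probabilities and mean costs from logged transitions, adds a cost penalty to state-action pairs that the data covers poorly, and then plans with value iteration.

```python
from collections import defaultdict


def build_pessimistic_mdp(transitions, min_count=2, penalty=10.0):
    """Estimate a tabular MDP from (state, action, cost, next_state) tuples.

    State-action pairs seen fewer than `min_count` times get `penalty`
    added to their mean cost (a simple, assumed form of pessimism).
    """
    counts = defaultdict(lambda: defaultdict(int))
    cost_sum = defaultdict(float)
    for s, a, c, s2 in transitions:
        counts[(s, a)][s2] += 1
        cost_sum[(s, a)] += c

    model = {}
    for sa, nexts in counts.items():
        n = sum(nexts.values())
        probs = {s2: k / n for s2, k in nexts.items()}
        mean_cost = cost_sum[sa] / n
        if n < min_count:
            mean_cost += penalty  # discourage poorly covered pairs
        model[sa] = (probs, mean_cost)
    return model


def plan(model, states, actions, penalty=10.0, gamma=0.9, iters=200):
    """Cost-minimizing value iteration on the estimated MDP.

    Pairs absent from the data are treated as if they incurred
    `penalty` at every future step.
    """
    unknown_value = penalty / (1.0 - gamma)
    V = {s: 0.0 for s in states}

    def q(s, a):
        if (s, a) not in model:
            return unknown_value
        probs, cost = model[(s, a)]
        future = sum(p * V.get(s2, unknown_value) for s2, p in probs.items())
        return cost + gamma * future

    for _ in range(iters):
        V = {s: min(q(s, a) for a in actions) for s in states}
    return {s: min(actions, key=lambda a: q(s, a)) for s in states}


# Toy log: queue level as state, signal decision as action (hypothetical data).
log = (
    [("high", "switch", 1.0, "low")] * 3
    + [("high", "keep", 5.0, "high")] * 3
    + [("low", "keep", 1.0, "low")] * 3
    + [("low", "switch", 0.0, "low")]  # seen once, so it is penalized
)
model = build_pessimistic_mdp(log)
policy = plan(model, states=["low", "high"], actions=["keep", "switch"])
```

In this toy log, the single observation of `("low", "switch")` looks free of cost, but the penalty keeps the planner from trusting it, so the resulting policy stays with actions the data supports well.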
Taxonomy
Topics: Traffic control and management · Traffic Prediction and Management Techniques · Traffic and Road Safety
