Real-time system optimal traffic routing under uncertainties -- Can physics models boost reinforcement learning?
Zemian Ke, Qiling Zou, Jiachao Liu, Sean Qian

TL;DR
This paper introduces TransRL, a novel reinforcement learning algorithm that integrates physics models to improve real-time traffic routing under uncertainty, achieving better adaptability, reliability, and interpretability in large transportation networks.
Contribution
TransRL combines physics-based deterministic policies with reinforcement learning, guiding learning with a differentiable teacher policy to enhance performance and interpretability in traffic routing.
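One way to picture the guidance idea is as a penalty that keeps the learned policy close to a teacher built around a physics-based action. The sketch below is illustrative only and is not taken from the paper: the capacity-proportional routing rule, the Gaussian form of both policies, the KL penalty, and every function name are assumptions chosen to keep the example small.

```python
import math


def physics_action(capacity_a, capacity_b):
    """Hypothetical physics-based rule (an assumption, not the paper's model):
    send vehicles to route A in proportion to its share of total capacity."""
    return capacity_a / (capacity_a + capacity_b)


def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(N(mu_p, sigma_p^2) || N(mu_q, sigma_q^2)) for scalars.
    It is differentiable in mu_p and sigma_p, so it can act as a training signal."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
        - 0.5
    )


def guided_loss(rl_loss, learner_mu, learner_sigma, teacher_mu, teacher_sigma, weight):
    """Assumed combined objective: the usual RL loss plus a weighted penalty
    for drifting away from the teacher policy."""
    return rl_loss + weight * gaussian_kl(
        learner_mu, learner_sigma, teacher_mu, teacher_sigma
    )


# Teacher: a Gaussian centered on the physics-based action (assumed form).
teacher_mu = physics_action(capacity_a=30.0, capacity_b=10.0)
teacher_sigma = 0.1

# A learner that matches the teacher pays no penalty; one that drifts pays more.
matched = guided_loss(1.0, teacher_mu, 0.1, teacher_mu, teacher_sigma, weight=0.5)
drifted = guided_loss(1.0, 0.5, 0.1, teacher_mu, teacher_sigma, weight=0.5)
```

Under these assumptions the penalty weight controls how strongly the physics model constrains the learner: a large weight keeps the policy near the physics-based action, while a small weight lets the reward signal dominate.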
Findings
TransRL outperforms model-free RL baselines such as PPO and SAC.
TransRL demonstrates superior adaptability and reliability in large networks.
Physics model integration improves interpretability of traffic routing policies.
Abstract
System optimal traffic routing can mitigate congestion by assigning routes for a portion of vehicles so that the total travel time of all vehicles in the transportation system can be reduced. However, achieving real-time optimal routing poses challenges due to uncertain demands and unknown system dynamics, particularly in expansive transportation networks. While physics model-based methods are sensitive to uncertainties and model mismatches, model-free reinforcement learning struggles with learning inefficiencies and interpretability issues. Our paper presents TransRL, a novel algorithm that integrates reinforcement learning with physics models for enhanced performance, reliability, and interpretability. TransRL begins by establishing a deterministic policy grounded in physics models; it then learns from and is guided by a differentiable and stochastic teacher policy built on that deterministic policy. During…
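The gap between what drivers choose on their own and the system optimum can be seen on a toy network. The sketch below uses the classic two-route (Pigou-style) example, which is not from the paper: route A always takes one time unit, while route B's travel time equals the fraction of traffic using it. The cost functions and variable names are assumptions for illustration.

```python
def total_travel_time(share_b):
    """Total travel time for one unit of demand with fraction share_b on route B."""
    time_a = 1.0       # route A: fixed travel time, never congests
    time_b = share_b   # route B: travel time grows with its share of the flow
    return (1.0 - share_b) * time_a + share_b * time_b


# User equilibrium: route B is never slower than route A, so every driver picks B.
ue_share_b = 1.0
ue_total = total_travel_time(ue_share_b)

# System optimum: search a grid of splits for the one minimizing total travel time.
candidates = [i / 1000 for i in range(1001)]
so_share_b = min(candidates, key=total_travel_time)
so_total = total_travel_time(so_share_b)

print(f"user equilibrium: share on B = {ue_share_b}, total time = {ue_total}")
print(f"system optimum:   share on B = {so_share_b}, total time = {so_total}")
```

In this toy case, rerouting half of the vehicles to route A lowers total travel time from 1.0 to 0.75, which is the kind of gain that assigning routes to a portion of vehicles aims for.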
Taxonomy
Topics: Smart Grid Security and Resilience · Fault Detection and Control Systems
Methods: Dilated Convolution · Convolution · 1x1 Convolution · Global Average Pooling · Average Pooling · Entropy Regularization · Adam · Switchable Atrous Convolution
