TL;DR
DGLight is a critic-guided reinforcement learning framework that fine-tunes large language models for traffic signal control, achieving interpretable reasoning and strong performance on benchmark datasets.
Contribution
Introduces DGLight, which uses a frozen CoLight-based Deep Q-Network critic to score candidate actions and Group Relative Policy Optimization (GRPO) to adapt a pretrained LLM for traffic signal control, emphasizing interpretability and transferability.
Findings
DGLight is the strongest overall method among the compared LLM-based controllers on TSC benchmarks covering Jinan and Hangzhou.
The method remains competitive with strong RL baselines.
Generated reasoning traces are interpretable and aligned with signal decisions.
Abstract
Traffic signal control (TSC) plays a central role in reducing congestion and maintaining urban mobility. This dissertation introduces DGLight, a critic-guided reinforcement-learning framework for adapting a pretrained large language model to TSC. DGLight first trains a CoLight-based Deep Q-Network critic to estimate traffic-aware action values from structured intersection states, then uses the frozen critic to score candidate language-model actions and optimize the policy with Group Relative Policy Optimization (GRPO). The resulting controller maps traffic states to interpretable reasoning traces and signal decisions while learning from dense per-state supervision rather than raw cumulative environment rewards. Experiments on TSC benchmarks covering Jinan and Hangzhou show that DGLight is the strongest overall method among the compared LLM-based controllers, remains competitive with…
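The critic-guided step described in the abstract can be illustrated with a minimal sketch. It assumes that the frozen critic returns one Q-value per signal phase for a given intersection state, that the reward for each sampled language-model completion is simply the Q-value of the phase it selected, and that advantages are standardized within the group as in the usual GRPO formulation. The function name, the Q-values, and the sampled phases are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(q_values, sampled_actions, eps=1e-8):
    """Turn critic scores for a group of sampled actions into GRPO-style advantages.

    q_values: per-phase action values from the frozen critic (assumed format).
    sampled_actions: phase index chosen by each sampled LLM completion.
    """
    # Reward of each completion = critic's Q-value for the phase it picked (assumption).
    rewards = [q_values[a] for a in sampled_actions]
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # Standardize within the group; eps avoids division by zero when all rewards match.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical critic output for one intersection state with four signal phases.
q_values = [-3.0, -1.0, -2.0, -4.0]
# Hypothetical phase indices parsed from a group of four LLM completions.
sampled_actions = [1, 1, 3, 0]

advantages = group_relative_advantages(q_values, sampled_actions)
```

Completions that choose the phase the critic values most receive positive advantages, and those that choose poorly valued phases receive negative ones. Because the signal comes from the critic for every state, this is one way to read the abstract's "dense per-state supervision" as opposed to a single cumulative environment reward.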
