DGLight: DQN-Guided GRPO Fine-Tuning of Large Language Models for Traffic Signal Control

Chenbo Yu

arXiv:2604.25259·cs.LG·April 29, 2026

TL;DR

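DGLight is a critic-guided reinforcement learning framework that fine-tunes a pretrained large language model for traffic signal control. It produces interpretable reasoning traces alongside signal decisions and is the strongest overall method among the compared LLM-based controllers on the Jinan and Hangzhou benchmarks.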

Contribution

Introduces DGLight, which trains a CoLight-based DQN critic, freezes it, and uses its action-value scores to fine-tune an LLM with GRPO for traffic signal control, emphasizing interpretability and transferability.

Findings

01

DGLight is the strongest overall method among the compared LLM-based controllers on the Jinan and Hangzhou TSC benchmarks.

02

The method remains competitive with strong RL baselines.

03

Generated reasoning traces are interpretable and aligned with signal decisions.

Abstract

Traffic signal control (TSC) plays a central role in reducing congestion and maintaining urban mobility. This dissertation introduces DGLight, a critic-guided reinforcement-learning framework for adapting a pretrained large language model to TSC. DGLight first trains a CoLight-based Deep Q-Network critic to estimate traffic-aware action values from structured intersection states, then uses the frozen critic to score candidate language-model actions and optimize the policy with Group Relative Policy Optimization (GRPO). The resulting controller maps traffic states to interpretable reasoning traces and signal decisions while learning from dense per-state supervision rather than raw cumulative environment rewards. Experiments on TSC benchmarks covering Jinan and Hangzhou show that DGLight is the strongest overall method among the compared LLM-based controllers, remains competitive with…
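The critic-scoring step described in the abstract can be illustrated with a minimal sketch. It assumes the frozen critic has already produced one Q-value per signal phase for a single intersection state, and that the language model has sampled a group of candidate phase indices for that state. The function name, the toy Q-values, and the choice of population standard deviation are illustrative assumptions and are not taken from the paper or its repository.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(
    q_values: Sequence[float],
    sampled_actions: Sequence[int],
    eps: float = 1e-6,
) -> List[float]:
    """Turn critic scores for a group of sampled actions into GRPO-style advantages.

    q_values: hypothetical output of a frozen critic, one value per signal phase.
    sampled_actions: phase indices chosen by the group of LLM samples for one state.
    """
    # Each sample's reward is the critic's value for the action it chose.
    rewards = [q_values[a] for a in sampled_actions]
    # Normalize within the group: subtract the group mean, divide by the group std.
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std is an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy example: four phases, four sampled completions for the same state.
q_values = [-3.0, -1.0, -2.0, -4.0]
sampled_actions = [1, 1, 0, 3]
advantages = group_relative_advantages(q_values, sampled_actions)
```

Samples that pick the phase the critic rates highest receive positive advantages, and the rest receive negative ones. Because every sampled action is scored for the same state, the signal is dense and per-state, which matches the abstract's contrast with raw cumulative environment rewards. When all samples choose the same action, the advantages collapse to zero and that group contributes no gradient.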

Code & Models

Repositories

yyccbb/FYP_LLMTSC
github
