Performance Comparison of Deep RL Algorithms for Mixed Traffic Cooperative Lane-Changing
Xue Yao, Shengren Hou, Serge P. Hoogendoorn, and Simeon C. Calvert

TL;DR
This paper compares deep reinforcement learning algorithms for cooperative lane-changing in mixed traffic, showing PPO's superior performance in safety, efficiency, comfort, and ecological impact.
Contribution
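It extends the authors' earlier cooperative lane-changing in mixed traffic (CLCMT) mechanism by accounting for the uncertainty of human-driven vehicles (HVs) and the microscopic interactions between HVs and connected and automated vehicles (CAVs), and it compares four DRL algorithms (DDPG, TD3, SAC, and PPO) on the resulting lane-changing task.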
Findings
PPO outperforms DDPG, TD3, and SAC in safety and efficiency.
DRL algorithms effectively handle traffic uncertainty.
PPO achieves higher rewards with fewer crashes.
Abstract
Lane-changing (LC) is a challenging scenario for connected and automated vehicles (CAVs) because of the complex dynamics and high uncertainty of the traffic environment. This challenge can be handled by deep reinforcement learning (DRL) approaches, leveraging their data-driven and model-free nature. Our previous work proposed a cooperative lane-changing in mixed traffic (CLCMT) mechanism based on TD3 to facilitate an optimal lane-changing strategy. This study enhances the current CLCMT mechanism by considering both the uncertainty of the human-driven vehicles (HVs) and the microscopic interactions between HVs and CAVs. The state-of-the-art (SOTA) DRL algorithms including DDPG, TD3, SAC, and PPO are utilized to deal with the formulated MDP with continuous actions. Performance comparison among the four DRL algorithms demonstrates that DDPG, TD3, and PPO algorithms can deal with…
Taxonomy
Topics: Traffic control and management · Traffic Prediction and Management Techniques · Autonomous Vehicle Technology and Safety
Methods: Adam · Target Policy Smoothing · Dense Connections · Weight Decay · Experience Replay · Convolution · Batch Normalization · Clipped Double Q-learning · 1x1 Convolution
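
Two of the methods listed above, target policy smoothing and clipped double Q-learning, are the core of the TD3 critic target. The sketch below shows how that target is commonly computed. It is a generic illustration, not the paper's implementation: the function name `td3_target`, the callables `actor_target`, `q1_target`, and `q2_target`, and the default hyperparameters are all assumptions chosen for the example.

```python
import numpy as np


def td3_target(reward, next_state, done, actor_target, q1_target, q2_target,
               gamma=0.99, policy_noise=0.2, noise_clip=0.5, max_action=1.0,
               rng=None):
    """Compute TD3 critic targets for a batch of transitions.

    All argument names and defaults are illustrative assumptions.
    actor_target(next_state) -> actions with shape (batch, action_dim)
    q*_target(next_state, action) -> values with shape (batch,)
    """
    rng = np.random.default_rng() if rng is None else rng

    # Target policy smoothing: add clipped Gaussian noise to the target action.
    next_action = actor_target(next_state)
    noise = rng.normal(0.0, policy_noise, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)

    # Clipped double Q-learning: use the smaller of the two target critics.
    q_min = np.minimum(q1_target(next_state, next_action),
                       q2_target(next_state, next_action))

    # One-step bootstrapped target; terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - done) * q_min


# Toy usage with constant stand-in networks (hypothetical, for illustration).
states = np.zeros((2, 4))
targets = td3_target(
    reward=np.array([1.0, 1.0]),
    next_state=states,
    done=np.array([0.0, 1.0]),
    actor_target=lambda s: np.zeros((s.shape[0], 1)),
    q1_target=lambda s, a: np.full(s.shape[0], 2.0),
    q2_target=lambda s, a: np.full(s.shape[0], 1.0),
    gamma=0.5,
)
```

Taking the minimum of the two critics counters the overestimation bias that a single critic tends to accumulate, and the clipped noise keeps the target from exploiting sharp peaks in the learned value function.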
