Autonomous Platoon Control with Integrated Deep Reinforcement Learning and Dynamic Programming
Tong Liu, Lei Lei, Kan Zheng, Kuan Zhang

TL;DR
This paper introduces a novel integrated deep reinforcement learning and dynamic programming approach for autonomous platoon control, enhancing stability and efficiency in multi-vehicle car-following scenarios with unpredictable leaders.
Contribution
It proposes the FH-DDPG-SS algorithm, combining network weight transfer, stationary policy approximation, and state space sweeping to improve platoon control performance.
Findings
FH-DDPG-SS outperforms benchmark algorithms in simulations.
The approach demonstrates improved string stability and safety.
Simulation results validate the effectiveness of the proposed method.
Abstract
Deep Reinforcement Learning (DRL) is regarded as a potential method for car-following control and has been mostly studied to support a single following vehicle. However, it is more challenging to learn a stable and efficient car-following policy when there are multiple following vehicles in a platoon, especially with unpredictable leading vehicle behavior. In this context, we adopt an integrated DRL and Dynamic Programming (DP) approach to learn autonomous platoon control policies, which embeds the Deep Deterministic Policy Gradient (DDPG) algorithm into a finite-horizon value iteration framework. Although the DP framework can improve the stability and performance of DDPG, it has the limitations of lower sampling and training efficiency. In this paper, we propose an algorithm, namely Finite-Horizon-DDPG with Sweeping through reduced state space using Stationary approximation…
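The finite-horizon value iteration framework mentioned in the abstract can be illustrated with a minimal tabular sketch. This is not the paper's FH-DDPG-SS algorithm: the per-time-step DDPG learner is replaced here by an exact minimization over a small discrete action set, and the toy gap-error dynamics, the cost function, and all names (`finite_horizon_value_iteration`, `step`, `cost`) are assumptions made only for illustration. The sketch shows the backward recursion alone: the policy for time step t is computed from the value function of time step t+1.

```python
def finite_horizon_value_iteration(states, actions, step, cost, horizon):
    """Backward recursion: V_T = 0, V_t(s) = min_a [cost(s, a) + V_{t+1}(step(s, a))]."""
    value = {s: 0.0 for s in states}  # terminal values V_T
    policies = []
    for _ in range(horizon):
        new_value, policy = {}, {}
        for s in states:
            # Greedy action with respect to the next step's value function.
            best_a = min(actions, key=lambda a: cost(s, a) + value[step(s, a)])
            policy[s] = best_a
            new_value[s] = cost(s, best_a) + value[step(s, best_a)]
        value = new_value
        policies.append(policy)
    policies.reverse()  # policies[t] is now the policy for time step t
    return value, policies


# Hypothetical toy problem: the state is a discretized gap error to the
# preceding vehicle, the action nudges that error by -1, 0, or +1.
STATES = range(-2, 3)
ACTIONS = (-1, 0, 1)


def step(gap_error, accel):
    # Deterministic transition, clipped to the state grid.
    return max(-2, min(2, gap_error + accel))


def cost(gap_error, accel):
    # Penalize gap error and control effort.
    return gap_error ** 2 + 0.1 * accel ** 2


value0, policies = finite_horizon_value_iteration(
    STATES, ACTIONS, step, cost, horizon=3
)
```

In this toy setting the policy is time-dependent: early steps close the gap error, while the final step applies no control because nothing after it is penalized. A finite-horizon formulation has to learn one policy per time step, which is one reason such a framework can need more samples and training than a single stationary policy.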
Taxonomy
Methods: Batch Normalization · Dense Connections · Experience Replay · Convolution · Weight Decay · Adam · Deep Deterministic Policy Gradient
