Autonomous Platoon Control with Integrated Deep Reinforcement Learning and Dynamic Programming

Tong Liu; Lei Lei; Kan Zheng; Kuan Zhang

arXiv:2206.07536 · eess.SY · November 21, 2022

TL;DR

This paper applies an integrated deep reinforcement learning and dynamic programming approach to autonomous platoon control, targeting stable and efficient car-following for multiple following vehicles behind an unpredictable leading vehicle.

Contribution

It proposes the FH-DDPG-SS algorithm, combining network weight transfer, stationary policy approximation, and state space sweeping to improve platoon control performance.

Findings

01

FH-DDPG-SS outperforms benchmark algorithms in simulations.

02

The approach demonstrates improved string stability and safety.

03

Simulation results validate the effectiveness of the proposed method.

Abstract

Deep Reinforcement Learning (DRL) is regarded as a potential method for car-following control and has been mostly studied to support a single following vehicle. However, it is more challenging to learn a stable and efficient car-following policy when there are multiple following vehicles in a platoon, especially with unpredictable leading vehicle behavior. In this context, we adopt an integrated DRL and Dynamic Programming (DP) approach to learn autonomous platoon control policies, which embeds the Deep Deterministic Policy Gradient (DDPG) algorithm into a finite-horizon value iteration framework. Although the DP framework can improve the stability and performance of DDPG, it has the limitations of lower sampling and training efficiency. In this paper, we propose an algorithm, namely Finite-Horizon-DDPG with Sweeping through reduced state space using Stationary approximation…
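
The finite-horizon value iteration framework mentioned in the abstract can be illustrated with a minimal tabular sketch. This is not the FH-DDPG-SS algorithm: where the paper embeds DDPG in the framework, this toy simply enumerates a small discrete state and action set at each stage. The state (an integer gap error), the actions, the `step` and `reward` functions, and the horizon of 5 are all assumptions made for illustration.

```python
import numpy as np


def finite_horizon_value_iteration(states, actions, step, reward, horizon):
    """Backward induction over a finite horizon for a deterministic toy MDP.

    Returns value[t, i] (best return from state i at stage t) and
    policy[t, i] (index of the best action at stage t in state i).
    """
    index = {s: i for i, s in enumerate(states)}
    value = np.zeros((horizon + 1, len(states)))  # value[horizon] is the terminal value, 0
    policy = np.zeros((horizon, len(states)), dtype=int)

    # Solve the last stage first, then move backwards in time.
    for t in range(horizon - 1, -1, -1):
        for i, s in enumerate(states):
            q = [reward(s, a) + value[t + 1, index[step(s, a)]] for a in actions]
            best = int(np.argmax(q))
            policy[t, i] = best
            value[t, i] = q[best]
    return value, policy


# Hypothetical toy problem: the state is a discretized gap error to the
# preceding vehicle, and the action nudges that error by -1, 0, or +1.
states = list(range(-3, 4))
actions = [-1, 0, 1]


def step(s, a):
    # Assumed dynamics: the gap error moves by the action, clipped to the grid.
    return max(-3, min(3, s + a))


def reward(s, a):
    # Assumed cost: penalize gap error and control effort.
    return -(s ** 2) - 0.1 * a ** 2


value, policy = finite_horizon_value_iteration(states, actions, step, reward, horizon=5)
```

In this toy, the policy at the final stage differs from the policy at earlier stages (at the last step no action can improve the outcome, so the zero-effort action is chosen everywhere), which shows that a finite-horizon policy is in general stage-dependent.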

Taxonomy

Methods: Batch Normalization · Dense Connections · Experience Replay · Convolution · Weight Decay · Adam · Deep Deterministic Policy Gradient