A New Approach to Training Multiple Cooperative Agents for Autonomous Driving
Ruiyang Yang, Siheng Li, Beihong Jin

TL;DR
This paper introduces Lepus, a novel cooperative training method for multiple autonomous driving agents that enhances stability and collision avoidance through shared policies, reward functions, and adversarial pre-training.
Contribution
Lepus is a new approach that combines shared parameters, adversarial pre-training, and reward approximation to improve cooperative multi-agent autonomous driving.
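As a rough illustration of the shared-parameter and shared-reward idea, the sketch below uses one set of policy weights to act for every agent and gives each agent the same team-level reward. The linear-tanh policy, the class and function names, and the use of the mean as the team reward are all assumptions made for illustration; the paper's actual network architecture and reward definition are not given here.

```python
import numpy as np

class SharedPolicy:
    """One set of weights used by every agent (hypothetical linear-tanh policy)."""

    def __init__(self, obs_dim, act_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(obs_dim, act_dim))
        self.b = np.zeros(act_dim)

    def act(self, obs):
        # obs has shape (n_agents, obs_dim); the same W and b are applied to
        # every row, so all agents share parameters.
        return np.tanh(obs @ self.W + self.b)


def shared_reward(per_agent_rewards):
    # Assumption: the team reward is the mean of the individual rewards,
    # and every agent receives that same scalar.
    team = float(np.mean(per_agent_rewards))
    return np.full(len(per_agent_rewards), team)


policy = SharedPolicy(obs_dim=4, act_dim=2)
actions = policy.act(np.ones((3, 4)))      # three agents, identical observations
rewards = shared_reward([1.0, 2.0, 3.0])   # each agent gets the team reward
```

Because the weights are shared, agents with identical observations produce identical actions, and any gradient update computed from one agent's experience changes the behavior of all of them.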
Findings
Lepus outperforms four baseline methods in stability.
Lepus effectively reduces collisions during driving.
Pre-training via an adversarial process enhances collaborative decision-making.
Abstract
Training multiple agents to perform safe and cooperative control in the complex scenarios of autonomous driving has been a challenge. For a small fleet of cars moving together, this paper proposes Lepus, a new approach to training multiple agents. Lepus trains the agents in a purely cooperative manner, with shared policy-network parameters and a shared reward function across agents. In particular, Lepus pre-trains the policy networks via an adversarial process, improving their collaborative decision-making capability and, in turn, the stability of driving. Moreover, to alleviate the problem of sparse rewards, Lepus learns an approximate reward function from expert trajectories by combining a random network and a distillation network. We conduct extensive experiments on the MADRaS simulation platform. The experimental results show that multiple…
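One plausible reading of the reward approximation described in the abstract, in the spirit of random network distillation, can be sketched as follows: a fixed random target network is distilled into a predictor using only expert data, so the predictor's error stays small near expert behavior and grows away from it. Everything below is an assumption for illustration, not the paper's formulation: the function names, the tanh target, the linear predictor fitted by least squares (standing in for a trained distillation network), the `exp(-error)` reward, and the synthetic "expert" data.

```python
import numpy as np

def make_target(in_dim, out_dim, rng):
    # Fixed random network: its weights are never trained.
    A = rng.normal(size=(in_dim, out_dim))
    return lambda x: np.tanh(x @ A)


def fit_predictor(target, expert_x):
    # Distill the target onto expert inputs only. A linear model with a bias
    # term, fitted by least squares, keeps this sketch short and deterministic.
    X = np.hstack([expert_x, np.ones((len(expert_x), 1))])
    W, *_ = np.linalg.lstsq(X, target(expert_x), rcond=None)
    return lambda x: np.hstack([x, np.ones((len(x), 1))]) @ W


def approx_reward(x, target, predictor, scale=1.0):
    # Low prediction error means "looks like expert data", so reward is near 1.
    err = np.mean((predictor(x) - target(x)) ** 2, axis=1)
    return np.exp(-scale * err)


rng = np.random.default_rng(0)
expert_x = 0.5 + 0.1 * rng.normal(size=(500, 4))   # synthetic expert inputs
target = make_target(4, 8, rng)
predictor = fit_predictor(target, expert_x)

r_expert = approx_reward(expert_x, target, predictor)
r_far = approx_reward(expert_x + 3.0, target, predictor)  # far from expert data
```

The resulting reward is dense: every input gets a value in [0, 1], which is the property that helps when the environment's own reward is sparse.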
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Traffic control and management · Traffic Prediction and Management Techniques
Methods: Adam · Batch Normalization · Weight Decay · Experience Replay · Dense Connections · Convolution · MADDPG
