Multi-Agent Reinforcement Learning for Safe Autonomous Driving Under Pedestrian Behavioral Uncertainty

Prakash Aryan; Kaushik Raghupathruni; Timo Kehrer; Sebastiano Panichella

arXiv:2605.20255 · cs.LG · May 21, 2026

TL;DR

This paper uses multi-agent reinforcement learning to produce more realistic pedestrian behavior, including jaywalking, in autonomous driving simulations, with the aim of making simulation-based safety assessments more realistic.

Contribution

The paper introduces a multi-agent reinforcement learning (MARL) environment in which 12 pedestrians and a self-driving car (SDC) are co-trained, capturing pedestrian heterogeneity and behavioral uncertainty, especially in jaywalking scenarios.

Findings

01

The co-trained SDC achieved 78% goal success with a 14% collision rate.

02

Jaywalking accounted for 62% of collisions despite making up only 13% of crossings.

03

MARL pedestrian training reduced collisions by 30% compared to single-agent RL.
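
Finding 02 implies a per-crossing risk ratio that can be worked out from the two percentages. The sketch below assumes that both shares refer to the same set of crossings and that each collision is attributed to exactly one crossing; neither assumption is stated in this summary. The function name `relative_collision_rate` is illustrative and does not come from the paper.

```python
def relative_collision_rate(collision_share: float, crossing_share: float) -> float:
    """Ratio of per-crossing collision rates: one group versus all other crossings.

    collision_share: fraction of all collisions that involve the group.
    crossing_share:  fraction of all crossings made by the group.
    """
    if not (0.0 < collision_share < 1.0 and 0.0 < crossing_share < 1.0):
        raise ValueError("shares must lie strictly between 0 and 1")
    # Collisions per crossing, up to a common constant that cancels in the ratio.
    group_rate = collision_share / crossing_share
    rest_rate = (1.0 - collision_share) / (1.0 - crossing_share)
    return group_rate / rest_rate


# Shares reported in finding 02: 62% of collisions, 13% of crossings.
ratio = relative_collision_rate(collision_share=0.62, crossing_share=0.13)
print(f"{ratio:.1f}")  # 10.9
```

Under those assumptions, a jaywalking crossing is roughly 11 times as likely to end in a collision as a non-jaywalking one. The reported percentages are rounded, so the ratio is approximate.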

Abstract

Simulation-based testing of self-driving cars (SDCs) typically relies on scripted or simplified pedestrian models that do not capture the heterogeneity and uncertainty of real human crossing behavior. This limits the realism of safety assessments, especially in scenarios involving jaywalking, which is governed by latent personality traits that the vehicle cannot observe. We hypothesize that jointly training pedestrians and the SDC with multi-agent reinforcement learning (MARL) produces more realistic interaction scenarios than training the SDC against fixed pedestrian policies, and that the resulting behavior gap between predictable and unpredictable crossings can be measured directly from trajectories. This paper describes a MARL environment in which an SDC and 12 pedestrians are co-trained using Multi-Agent Proximal Policy Optimization (MAPPO). Pedestrian locomotion follows scripted…
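
The observation asymmetry described in the abstract, where a latent trait drives jaywalking but is hidden from the vehicle, can be illustrated with a small sketch. This is not the authors' environment and it does not implement MAPPO. Only the count of 12 pedestrians comes from the abstract. The field `jaywalk_propensity`, the coordinate ranges, the function names, and the choice to let each pedestrian observe its own trait are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

NUM_PEDESTRIANS = 12  # from the abstract; everything else here is illustrative


@dataclass
class Pedestrian:
    x: float
    y: float
    jaywalk_propensity: float  # hypothetical latent trait in [0, 1), hidden from the SDC


def spawn_pedestrians(rng: random.Random) -> List[Pedestrian]:
    """Sample heterogeneous pedestrians, each with its own latent trait."""
    return [
        Pedestrian(
            x=rng.uniform(0.0, 50.0),
            y=rng.uniform(-5.0, 5.0),
            jaywalk_propensity=rng.random(),
        )
        for _ in range(NUM_PEDESTRIANS)
    ]


def sdc_observation(pedestrians: List[Pedestrian]) -> List[Tuple[float, float]]:
    """What the vehicle sees: positions only, never the latent trait."""
    return [(p.x, p.y) for p in pedestrians]


def pedestrian_observation(
    index: int, pedestrians: List[Pedestrian], sdc_xy: Tuple[float, float]
) -> Tuple[float, ...]:
    """Assumed pedestrian view: own position, own trait, and the SDC position."""
    me = pedestrians[index]
    return (me.x, me.y, me.jaywalk_propensity, *sdc_xy)


rng = random.Random(0)
peds = spawn_pedestrians(rng)
obs = sdc_observation(peds)
```

Keeping the trait out of `sdc_observation` is what makes the vehicle's problem partially observable: it has to infer crossing intent from positions over time rather than read it directly.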
