Zero-Shot MARL Benchmark in the Cyber-Physical Mobility Lab

Julius Beerwerth; Jianye Xu; Simon Schäfer; Fynn Belderink; Bassam Alrifaee

arXiv:2601.16578·cs.RO·May 12, 2026

TL;DR

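This paper introduces a reproducible benchmark for evaluating zero-shot transfer of MARL policies for connected and automated vehicles across simulation, a digital twin, and real hardware. It traces performance degradation to two sources: architectural differences between the simulation and hardware control stacks, and the sim-to-real gap induced by increasing environmental realism.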

Contribution

It presents an open-source platform, based on the Cyber-Physical Mobility Lab (CPM Lab), that integrates simulation, a high-fidelity digital twin, and a physical testbed for systematic analysis of sim-to-real transfer in MARL for CAVs.

Findings

01

Identified architectural differences between the simulation and hardware control stacks as one source of performance degradation.

02

Identified the sim-to-real gap induced by increasing environmental realism as a complementary source of performance degradation.

03

Showcased the platform's utility for systematic, reproducible sim-to-real analysis by deploying a SigmaRL-trained policy across all three domains.

Abstract

We present a reproducible benchmark for evaluating sim-to-real transfer of Multi-Agent Reinforcement Learning (MARL) policies for Connected and Automated Vehicles (CAVs). The platform, based on the Cyber-Physical Mobility Lab (CPM Lab) [1], integrates simulation, a high-fidelity digital twin, and a physical testbed, enabling structured zero-shot evaluation of MARL motion-planning policies. We demonstrate its use by deploying a SigmaRL-trained policy [2] across all three domains, revealing two complementary sources of performance degradation: architectural differences between simulation and hardware control stacks, and the sim-to-real gap induced by increasing environmental realism. The open-source setup enables systematic analysis of sim-to-real challenges in MARL under realistic, reproducible conditions.
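
The three-domain evaluation described above lends itself to a stagewise comparison. The following is a minimal sketch, not the authors' evaluation code: it assumes a single scalar, higher-is-better performance metric per domain and an ordering of simulation, digital twin, physical testbed. The names `DOMAINS` and `stagewise_gaps` are hypothetical, and the abstract does not specify which metrics the benchmark reports.

```python
from typing import Dict, List, Tuple

# Assumed ordering of evaluation domains, from least to most realistic.
# The keys are hypothetical labels, not identifiers from the CPM Lab code.
DOMAINS = ("simulation", "digital_twin", "physical_testbed")


def stagewise_gaps(scores: Dict[str, float]) -> List[Tuple[str, str, float]]:
    """Return the drop in score between each consecutive pair of domains.

    `scores` maps each domain label to one scalar metric where higher is
    better (an assumption). A positive gap means performance degraded when
    moving to the more realistic domain.
    """
    missing = [d for d in DOMAINS if d not in scores]
    if missing:
        raise KeyError(f"missing domains: {missing}")
    # Pair each domain with its successor and subtract the later score.
    return [(a, b, scores[a] - scores[b]) for a, b in zip(DOMAINS, DOMAINS[1:])]
```

Calling `stagewise_gaps` with the measured score of one fixed policy in each domain yields two numbers, one per transition, which can then be inspected separately rather than as a single simulation-to-hardware difference.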
