PC3D: Zero-Shot Cooperation Across Variable Rosters via Personalized Context Distillation

Ahmet Onur Akman; Rafał Kucharski

arXiv:2605.10377 · cs.LG · May 12, 2026

TL;DR

PC3D trains decentralized agents in cooperative multi-agent reinforcement learning to cope with team sizes that vary across episodes: each agent recovers a personalized coordination context from its local history, distilled during training from a centralized teacher.

Contribution

The paper introduces PC3D, a method for training decentralized policies that recover and use a personalized coordination context at execution time, with no online retraining.

Findings

1. PC3D outperforms baselines on three MARL benchmarks.
2. It achieves higher returns with both seen and unseen team sizes.
3. Ablation studies confirm the importance of context distillation and adaptive use.

Abstract

Cooperative multi-agent reinforcement learning often assumes a fixed execution team, yet many decentralized systems must operate with varying numbers of active agents during deployment. We study this setting under episodic roster variation: each episode is executed by a set of homogeneous agents, with the team size varying across episodes. Agents act only from local histories, without execution-time communication, privileged coordinators, or online retraining. Therefore, effective cooperation requires each agent to recover relevant context about the active team and adapt its behavior accordingly. To this end, we propose PC3D (Personalized Central Coordination Context Distillation), a method for training decentralized policies to recover and use personalized coordination context from local interaction histories. During training, a set-structured centralized teacher compresses the active…
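The setting described in the abstract, where team size varies across episodes and a set-structured teacher summarizes the active team, can be illustrated with a minimal Python sketch. It is not the PC3D architecture: mean pooling and the names `encode_team` and `sample_roster` are assumptions, chosen only to show how a set-structured summary keeps a fixed size and ignores agent order as the roster changes.

```python
import random


def encode_team(observations):
    """Mean-pool per-agent feature vectors into one fixed-size summary.

    Mean pooling is permutation-invariant and accepts any number of agents.
    It is a stand-in (assumption), not the encoder used in the paper.
    """
    n_agents = len(observations)
    dim = len(observations[0])
    return [sum(obs[d] for obs in observations) / n_agents for d in range(dim)]


def sample_roster(team_sizes, obs_dim, rng):
    """Draw one episode's roster: a team size, then one observation per agent."""
    n_agents = rng.choice(team_sizes)
    return [[rng.random() for _ in range(obs_dim)] for _ in range(n_agents)]


rng = random.Random(0)
for episode in range(3):
    roster = sample_roster(team_sizes=[2, 4, 7], obs_dim=3, rng=rng)
    summary = encode_team(roster)
    # The summary length stays at obs_dim no matter how many agents are active.
    print(len(roster), len(summary))
```

A fixed-size summary of this kind is what lets one network consume teams of different sizes; how PC3D personalizes that summary per agent and distills it into decentralized policies is not shown here.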
