PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination
Xingzhou Lou, Jiaxian Guo, Junge Zhang, Jun Wang, Kaiqi Huang, Yali Du

TL;DR
This paper introduces PECAN, a policy ensemble approach that enhances zero-shot human-AI coordination by increasing partner diversity and enabling context-aware responses, leading to state-of-the-art results in Overcooked.
Contribution
The paper presents a novel policy ensemble and context-aware method to improve zero-shot coordination with diverse partners, surpassing existing approaches.
Findings
Significantly increased partner diversity in experiments.
Achieved state-of-the-art zero-shot coordination performance.
Enabled ego agents to learn more universal cooperative behaviors.
Abstract
Zero-shot human-AI coordination holds the promise of collaborating with humans without human data. Prevailing methods try to train the ego agent with a population of partners via self-play. However, these methods suffer from two problems: 1) The diversity of a population with finite partners is limited, thereby limiting the capacity of the trained ego agent to collaborate with a novel human; 2) Current methods only provide a common best response for every partner in the population, which may result in poor zero-shot coordination performance with a novel partner or humans. To address these issues, we first propose the policy ensemble method to increase the diversity of partners in the population, and then develop a context-aware method enabling the ego agent to analyze and identify the partner's potential policy primitives so that it can take different actions accordingly. In this way,…
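To make the policy ensemble idea in the abstract concrete, here is a minimal sketch of one way a finite population could yield many more distinct partners: mixing the partners' action distributions with randomly drawn weights. This is an illustrative assumption, not the authors' implementation. The names `mix_policies`, `sample_ensemble_partner`, and the three toy partners are hypothetical, and the abstract does not say how the ensemble weights are chosen.

```python
import random


def mix_policies(distributions, weights):
    """Return the weighted average (convex combination) of action distributions."""
    total = sum(weights)
    n_actions = len(distributions[0])
    return [
        sum(w * dist[a] for w, dist in zip(weights, distributions)) / total
        for a in range(n_actions)
    ]


def sample_ensemble_partner(policies, rng):
    """Build one ensemble partner from a finite population of policies.

    Assumption for illustration: mixing weights are drawn uniformly at random
    once per partner, so each call yields a different blend of the population.
    """
    weights = [rng.random() + 1e-8 for _ in policies]

    def partner(observation):
        dists = [policy(observation) for policy in policies]
        return mix_policies(dists, weights)

    return partner


# Three hypothetical fixed partners over a 3-action space.
def prefers_first(observation):
    return [1.0, 0.0, 0.0]


def prefers_last(observation):
    return [0.0, 0.0, 1.0]


def uniform(observation):
    return [1 / 3, 1 / 3, 1 / 3]


rng = random.Random(0)
partner = sample_ensemble_partner([prefers_first, prefers_last, uniform], rng)
probs = partner(None)  # the toy partners ignore the observation
action = rng.choices(range(3), weights=probs)[0]
```

Each call to `sample_ensemble_partner` produces a partner whose behavior lies between the population members, which is the sense in which an ensemble can widen the set of partners the ego agent trains against. The context-aware component described in the abstract (identifying the partner's policy primitives) is not covered by this sketch.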
Taxonomy
Topics: Ethics and Social Impacts of AI · Reinforcement Learning in Robotics · Context-Aware Activity Recognition Systems
