KnowPC: Knowledge-Driven Programmatic Reinforcement Learning for Zero-shot Coordination
Yin Gu, Qi Liu, Zhi Li, Kai Zhang

TL;DR
KnowPC introduces an interpretable, logic-based programmatic approach to zero-shot coordination in cooperative AI. It uses a domain-specific language, knowledge extracted from the environment, and reasoning to improve generalization and interpretability over neural network methods.
Contribution
This work presents a novel framework that combines knowledge extraction and reasoning to automatically learn interpretable program policies for zero-shot coordination.
Findings
Achieves improved generalization in zero-shot coordination tasks.
Provides interpretable policies based on logic and knowledge.
Outperforms neural network-based approaches in key metrics.
Abstract
Zero-shot coordination (ZSC) remains a major challenge in cooperative AI: the goal is to train an agent that can cooperate with an unseen partner in the training environments or even in novel environments. In recent years, a popular ZSC paradigm has been deep reinforcement learning (DRL) combined with advanced self-play or population-based methods, which improve the neural policy's ability to handle unseen partners. Despite some success, these approaches usually rely on black-box neural networks as the policy function. Neural networks typically lack interpretability and logic, which makes the learned policies difficult for partners (e.g., humans) to understand and limits their generalization ability. These shortcomings hinder the application of reinforcement learning methods in diverse cooperative scenarios. We propose representing the agent's policy with an interpretable program.…
Taxonomy
Topics: Reinforcement Learning in Robotics
