A Deep Policy Inference Q-Network for Multi-Agent Systems
Zhang-Wei Hong, Shih-Yang Su, Tzu-Yun Shann, Yi-Hsiang Chang, and Chun-Yi Lee

TL;DR
This paper introduces DPIQN and DRPIQN, deep reinforcement learning models that infer other agents' policies to improve decision-making in multi-agent systems, demonstrating superior performance in competitive and cooperative scenarios.
Contribution
The paper proposes novel deep policy inference Q-networks that incorporate inferred policy features, enhancing multi-agent learning under varying strategies and partial observability.
Findings
DPIQN and DRPIQN outperform baseline DQN and DRQN in soccer simulations.
Models adapt well to dynamic policy changes of collaborators and opponents.
Both models train more stably and reach higher mean scores in multi-agent tasks.
Abstract
We present DPIQN, a deep policy inference Q-network that targets multi-agent systems composed of controllable agents, collaborators, and opponents that interact with each other. We focus on one challenging issue in such systems---modeling agents with varying strategies---and propose to employ "policy features" learned from raw observations (e.g., raw images) of collaborators and opponents by inferring their policies. DPIQN incorporates the learned policy features as a hidden vector into its own deep Q-network (DQN), such that it is able to predict better Q values for the controllable agents than the state-of-the-art deep reinforcement learning models. We further propose an enhanced version of DPIQN, called deep recurrent policy inference Q-network (DRPIQN), for handling partial observability. Both DPIQN and DRPIQN are trained by an adaptive training procedure, which adjusts the…
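The forward pass described in the abstract, where policy features inferred from observations are fed into the agent's own Q-network as a hidden vector, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `PolicyInferenceQNet`, the layer sizes, the flat-vector linear encoder (standing in for a convolutional encoder over raw images), and the element-wise multiplication used to fuse the two feature streams are all hypothetical choices made for this example. The adaptive training procedure is not shown.

```python
import math
import random


def linear(x, w, b):
    """Affine map; w holds one row of weights per output unit."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (illustrative initialisation)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


class PolicyInferenceQNet:
    """Hypothetical two-stream network: policy inference plus Q-values."""

    def __init__(self, obs_dim, hidden_dim, n_actions, n_other_actions, seed=0):
        rng = random.Random(seed)
        self.enc = init_layer(rng, hidden_dim, obs_dim)         # shared encoder
        self.pi_feat = init_layer(rng, hidden_dim, hidden_dim)  # policy features
        self.pi_out = init_layer(rng, n_other_actions, hidden_dim)
        self.q_feat = init_layer(rng, hidden_dim, hidden_dim)   # Q-stream features
        self.q_out = init_layer(rng, n_actions, hidden_dim)

    def forward(self, obs):
        h = relu(linear(obs, *self.enc))
        # Policy features: a hidden vector trained to predict the other agent's action.
        h_pi = relu(linear(h, *self.pi_feat))
        pi_hat = softmax(linear(h_pi, *self.pi_out))
        # Assumption: fuse the policy features into the Q stream by
        # element-wise multiplication before the final Q-value layer.
        h_q = relu(linear(h, *self.q_feat))
        fused = [a * b for a, b in zip(h_q, h_pi)]
        q_values = linear(fused, *self.q_out)
        return q_values, pi_hat


# Usage: one forward pass on a dummy observation, then a greedy action choice.
net = PolicyInferenceQNet(obs_dim=8, hidden_dim=16, n_actions=5, n_other_actions=5)
q_values, pi_hat = net.forward([0.1] * 8)
greedy_action = max(range(len(q_values)), key=q_values.__getitem__)
```

In this sketch the network returns both the Q-values for the controllable agent and a predicted action distribution for the other agent, so a training loop could attach a Q-learning loss to the first output and a policy-inference loss (for example, cross-entropy against the other agent's observed action) to the second.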
Taxonomy
Topics: Reinforcement Learning in Robotics · Bayesian Modeling and Causal Inference · Machine Learning and ELM
Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network
