A Deep Policy Inference Q-Network for Multi-Agent Systems

Zhang-Wei Hong, Shih-Yang Su, Tzu-Yun Shann, Yi-Hsiang Chang, and Chun-Yi Lee

arXiv:1712.07893 · cs.AI · April 10, 2018 · 43 cites

Open Access

TL;DR

This paper introduces DPIQN and DRPIQN, deep reinforcement learning models that infer other agents' policies to improve decision-making in multi-agent systems, demonstrating superior performance in competitive and cooperative scenarios.

Contribution

The paper proposes novel deep policy inference Q-networks that incorporate inferred policy features, enhancing multi-agent learning under varying strategies and partial observability.

Findings

01. DPIQN and DRPIQN outperform the baseline DQN and DRQN in soccer simulations.

02. Both models adapt well to dynamic policy changes of collaborators and opponents.

03. Both models achieve greater stability and higher mean scores in multi-agent tasks.

Abstract

We present DPIQN, a deep policy inference Q-network that targets multi-agent systems composed of controllable agents, collaborators, and opponents that interact with each other. We focus on one challenging issue in such systems (modeling agents with varying strategies) and propose to employ "policy features" learned from raw observations (e.g., raw images) of collaborators and opponents by inferring their policies. DPIQN incorporates the learned policy features as a hidden vector into its own deep Q-network (DQN), such that it is able to predict better Q values for the controllable agents than the state-of-the-art deep reinforcement learning models. We further propose an enhanced version of DPIQN, called deep recurrent policy inference Q-network (DRPIQN), for handling partial observability. Both DPIQN and DRPIQN are trained by an adaptive training procedure, which adjusts the…
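
The idea of feeding inferred policy features into a Q-network can be illustrated with a small sketch. The NumPy code below is not the authors' implementation. It assumes a flat observation vector in place of raw images, dense layers in place of convolutions, an element-wise product as the way the policy features enter the Q branch, and a fixed hypothetical weight `lam` between the two losses rather than the adaptive procedure the abstract mentions. All class, function, and parameter names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(in_dim, out_dim):
    """Return randomly initialised (weights, bias) for one dense layer."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class PolicyInferenceQNet:
    """Hypothetical two-branch network: policy inference plus Q-values."""

    def __init__(self, obs_dim, hidden_dim, n_actions, n_other_actions):
        self.enc = dense(obs_dim, hidden_dim)         # shared encoder
        self.pi_feat = dense(hidden_dim, hidden_dim)  # policy-feature layer
        self.pi_out = dense(hidden_dim, n_other_actions)
        self.q_feat = dense(hidden_dim, hidden_dim)   # Q-feature layer
        self.q_out = dense(hidden_dim, n_actions)

    def forward(self, obs):
        h = relu(obs @ self.enc[0] + self.enc[1])
        # Policy features: a hidden vector trained to predict the other agent's action.
        h_pi = relu(h @ self.pi_feat[0] + self.pi_feat[1])
        pi_hat = softmax(h_pi @ self.pi_out[0] + self.pi_out[1])
        # Q branch: its features are modulated by the policy features
        # (the element-wise product is an assumption of this sketch).
        h_q = relu(h @ self.q_feat[0] + self.q_feat[1])
        q = (h_q * h_pi) @ self.q_out[0] + self.q_out[1]
        return q, pi_hat

def combined_loss(q, pi_hat, action, td_target, other_action, lam=1.0):
    """Squared TD error plus cross-entropy on the other agent's observed action."""
    idx = np.arange(len(action))
    q_loss = np.mean((td_target - q[idx, action]) ** 2)
    pi_loss = -np.mean(np.log(pi_hat[idx, other_action] + 1e-8))
    return lam * q_loss + pi_loss, q_loss, pi_loss

# Usage: a batch of 4 random observations with 5 actions per agent.
net = PolicyInferenceQNet(obs_dim=8, hidden_dim=16, n_actions=5, n_other_actions=5)
obs = rng.normal(size=(4, 8))
q, pi_hat = net.forward(obs)
total, q_loss, pi_loss = combined_loss(
    q, pi_hat,
    action=np.array([0, 1, 2, 3]),
    td_target=np.zeros(4),
    other_action=np.array([4, 3, 2, 1]),
)
```

The auxiliary cross-entropy term is what pushes `h_pi` to encode the other agent's behaviour, so the Q branch conditions on a representation of that agent's policy rather than on the observation alone. Gradient computation and the training loop are omitted here.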

Taxonomy

Topics: Reinforcement Learning in Robotics · Bayesian Modeling and Causal Inference · Machine Learning and ELM

Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network