Balancing Two-Player Stochastic Games with Soft Q-Learning

Jordi Grau-Moya; Felix Leibfried; Haitham Bou-Ammar

arXiv:1802.03216 · cs.AI · January 9, 2019

TL;DR

This paper extends soft Q-learning to two-player stochastic games, enabling tunable strategies that balance competitive and cooperative behaviors, with theoretical guarantees and empirical demonstrations using neural networks.

Contribution

The paper introduces a generalized soft Q-learning framework for stochastic games that allows strategic behavior to be adjusted, and it supports the framework with theoretical analysis and neural network-based implementations.

Findings

01. Games played with soft Q-learning exhibit a unique value.
02. The framework covers a continuous spectrum of behaviour, with team games and zero-sum games as its two extremes.
03. Tuning the agents' constraints changes their performance and how balanced the game is.

Abstract

Within the context of video games, the notion of perfectly rational agents can be undesirable, as it leads to uninteresting situations where humans face tough adversarial decision makers. Current frameworks for stochastic games and reinforcement learning prohibit tuneable strategies because they seek optimal performance. In this paper, we enable such tuneable behaviour by generalising soft Q-learning to stochastic games, where more than one agent interacts strategically. We contribute both theoretically and empirically. On the theory side, we show that games with soft Q-learning exhibit a unique value and generalise team games and zero-sum games far beyond these two extremes to cover a continuous spectrum of gaming behaviour. Experimentally, we show how tuning agents' constraints affects performance and demonstrate, through a neural network architecture, how to reliably balance games with…
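The idea of a continuous spectrum between team games and zero-sum games can be illustrated with a small sketch. It assumes the standard soft (log-mean-exp) operator with a uniform prior over actions and one inverse temperature per agent; the operator actually used in the paper may differ in detail. The names `soft_aggregate`, `soft_game_value`, `beta_player`, and `beta_opponent` are hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_aggregate(values, beta, axis=-1):
    """Soft operator (1/beta) * log(mean(exp(beta * values))).

    Assumption: uniform prior over actions. Large positive beta approaches
    max, large negative beta approaches min, and beta = 0 gives the mean.
    """
    values = np.asarray(values, dtype=float)
    if abs(beta) < 1e-12:
        return values.mean(axis=axis)
    z = beta * values
    z_max = z.max(axis=axis, keepdims=True)  # shift for numerical stability
    log_mean_exp = np.squeeze(z_max, axis=axis) + np.log(
        np.mean(np.exp(z - z_max), axis=axis)
    )
    return log_mean_exp / beta

def soft_game_value(q, beta_player, beta_opponent):
    """Soft value of one state given q[a, b], the player's payoff for
    player action a and opponent action b (hypothetical layout)."""
    # Opponent responds softly to each player action:
    # beta_opponent < 0 is adversarial, beta_opponent > 0 is cooperative.
    inner = soft_aggregate(q, beta_opponent, axis=1)
    # Player then aggregates softly over its own actions.
    return soft_aggregate(inner, beta_player, axis=0)

# Matching-pennies style payoff table, used only as a toy example.
q = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

for beta_opponent in (-50.0, -1.0, 0.0, 1.0, 50.0):
    print(beta_opponent, soft_game_value(q, 50.0, beta_opponent))
```

Under these assumptions, sweeping `beta_opponent` from strongly negative to strongly positive moves the value from a zero-sum-like (max-min) regime to a team-like (max-max) regime, which is one way to read "tuneable behaviour".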

Taxonomy

Methods: Q-Learning