# Deep Q-Learning for Nash Equilibria: Nash-DQN

**Authors:** Philippe Casgrain, Brian Ning, Sebastian Jaimungal

arXiv: 1904.10554 · 2022-10-25

## TL;DR

This paper introduces Nash-DQN, a data-efficient deep Q-learning algorithm for model-free learning of Nash equilibria in general-sum stochastic games. It targets limitations of earlier reinforcement learning methods, which are often restricted to zero-sum games and small state-action spaces.

## Contribution

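The paper presents a deep Q-learning approach built on a local linear-quadratic expansion of the stochastic game. The expansion is parametrized by deep neural networks and yields analytically solvable optimal actions, so Nash equilibria can be learned without experiencing every state-action pair.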
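To illustrate why a quadratic local form makes the equilibrium actions analytically solvable, here is a minimal two-agent sketch. It is not the paper's parametrization: the class `QuadraticPayoff`, its coefficients `b`, `p`, `c`, and the function `nash_actions` are hypothetical names chosen for this example, and the coefficients are fixed numbers rather than neural-network outputs. Each agent's payoff is assumed to be strictly concave in its own action, so the Nash actions follow from two linear first-order conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuadraticPayoff:
    """Hypothetical local payoff, quadratic in the agent's own action:

    q(a_own, a_other) = b*a_own - 0.5*p*a_own**2 + c*a_own*a_other
    """

    b: float  # linear coefficient on the agent's own action
    p: float  # curvature; p > 0 makes the payoff strictly concave in a_own
    c: float  # coupling with the other agent's action

    def value(self, a_own: float, a_other: float) -> float:
        return self.b * a_own - 0.5 * self.p * a_own**2 + self.c * a_own * a_other

    def best_response(self, a_other: float) -> float:
        # First-order condition: b - p*a_own + c*a_other = 0
        return (self.b + self.c * a_other) / self.p


def nash_actions(q1: QuadraticPayoff, q2: QuadraticPayoff) -> tuple[float, float]:
    """Solve both first-order conditions at once (a 2x2 linear system).

    p1*a1 - c1*a2 = b1
    -c2*a1 + p2*a2 = b2
    """
    det = q1.p * q2.p - q1.c * q2.c
    if q1.p <= 0 or q2.p <= 0 or abs(det) < 1e-12:
        raise ValueError("need p > 0 for both agents and a non-singular system")
    a1 = (q1.b * q2.p + q1.c * q2.b) / det
    a2 = (q2.b * q1.p + q2.c * q1.b) / det
    return a1, a2


# Example coefficients (arbitrary values chosen for illustration only).
agent_1 = QuadraticPayoff(b=1.0, p=2.0, c=0.5)
agent_2 = QuadraticPayoff(b=-0.5, p=1.5, c=-0.3)
a1_star, a2_star = nash_actions(agent_1, agent_2)

# At the solution, each action is a best response to the other.
print(a1_star, agent_1.best_response(a2_star))
print(a2_star, agent_2.best_response(a1_star))
```

Because each payoff is strictly concave in the agent's own action, any unilateral deviation from the returned pair lowers that agent's payoff, which is the defining property of a Nash equilibrium. In a learned setting, the coefficients would vary with the state; here they are constants to keep the algebra visible.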

## Key findings

- Applied, as a proof of concept, to learning optimal trading strategies in competitive electronic markets.
- Designed for data efficiency: the neural-network parametrization lets the algorithm learn the environment without experiencing all state-action pairs.
- Extends model-free reinforcement learning from zero-sum games to general-sum stochastic games.

## Abstract

Model-free learning for multi-agent stochastic games is an active area of research. Existing reinforcement learning algorithms, however, are often restricted to zero-sum games, and are applicable only in small state-action spaces or other simplified settings. Here, we develop a new data efficient Deep-Q-learning methodology for model-free learning of Nash equilibria for general-sum stochastic games. The algorithm uses a local linear-quadratic expansion of the stochastic game, which leads to analytically solvable optimal actions. The expansion is parametrized by deep neural networks to give it sufficient flexibility to learn the environment without the need to experience all state-action pairs. We study symmetry properties of the algorithm stemming from label-invariant stochastic games and as a proof of concept, apply our algorithm to learning optimal trading strategies in competitive electronic markets.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.10554/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1904.10554/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/1904.10554/full.md

---
Source: https://tomesphere.com/paper/1904.10554