Deviations from the Nash equilibrium in a two-player optimal execution game with reinforcement learning

Fabrizio Lillo; Andrea Macrì

arXiv:2408.11773·q-fin.TR·February 16, 2026

TL;DR

This paper investigates how autonomous reinforcement learning agents trading in a financial market deviate from the Nash equilibrium, often reaching supra-competitive outcomes that resemble collusive behavior, and how market volatility influences those outcomes.

Contribution

The paper demonstrates that, in the presence of market impact, reinforcement learning agents can learn strategies that deviate from the Nash equilibrium, often resulting in Pareto-optimal or collusive-like solutions.
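To make the market impact setting concrete, the sketch below computes each agent's implementation shortfall when two sellers trade the same asset. It assumes linear permanent and temporary impact driven by the total traded volume, in the spirit of the Almgren-Chriss framework; the function name `implementation_shortfalls` and the parameters `kappa`, `alpha`, `sigma`, and `s0` are hypothetical choices for illustration, not values or notation taken from the paper.

```python
import numpy as np

def implementation_shortfalls(v1, v2, s0=10.0, kappa=0.1, alpha=0.2, sigma=0.0, seed=0):
    """Shortfall of two sellers under linear impact (illustrative assumption).

    v1, v2 : shares sold by each agent in every time step.
    kappa  : permanent impact per share traded (assumed linear).
    alpha  : temporary impact per share traded (assumed linear).
    sigma  : volatility of the additive price noise.
    """
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    rng = np.random.default_rng(seed)
    price = s0
    cash = np.zeros(2)
    for a, b in zip(v1, v2):
        total = a + b
        # Both agents execute at the same price, depressed by total selling.
        exec_price = price - alpha * total
        cash += np.array([a, b]) * exec_price
        # Permanent impact and noise move the mid price for the next step.
        price += -kappa * total + sigma * rng.standard_normal()
    # Shortfall = initial value of the inventory minus cash actually received.
    return s0 * np.array([v1.sum(), v2.sum()]) - cash

# Two agents each selling one share per step over two steps, no noise.
print(implementation_shortfalls([1, 1], [1, 1]))
```

Because each agent's execution price depends on the other's volume, the two costs are coupled; this coupling is what turns the liquidation problem into a game in which Nash and Pareto-optimal schedules can be compared.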

Findings

01 Agents deviate from Nash equilibrium strategies.

02 Learned strategies often exhibit supra-competitive, collusive-like behavior.

03 Market volatility influences the strategies and equilibria discovered.

Abstract

The use of reinforcement learning algorithms in financial trading is becoming increasingly prevalent. However, the autonomous nature of these algorithms can lead to unexpected outcomes that deviate from traditional game-theoretical predictions and may even destabilize markets. In this study, we examine a scenario in which two autonomous agents, modelled with Double Deep Q-Learning, learn to liquidate the same asset optimally in the presence of market impact, under the Almgren-Chriss (2000) framework. We show that the strategies learned by the agents deviate significantly from the Nash equilibrium of the corresponding market impact game. Notably, the learned strategies exhibit a supra-competitive solution, which might be compatible with tacit collusive behaviour, closely aligning with the Pareto-optimal solution. We further explore how different levels of market volatility influence…
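As a reminder of what Double Deep Q-Learning changes relative to plain Q-learning, the sketch below computes the standard Double DQN bootstrap target for one transition: the online network selects the next action and the target network evaluates it. The function name `double_dqn_target` and the default `gamma` are hypothetical; the network architecture, state and action definitions, and hyperparameters used in the paper are not given in this summary and are not reproduced here.

```python
import numpy as np

def double_dqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double DQN target for a single transition.

    q_online_next : Q-values of the next state from the online network.
    q_target_next : Q-values of the next state from the target network.
    """
    if done:
        # No bootstrapping on terminal transitions.
        return float(reward)
    # Action selection uses the online network ...
    best_action = int(np.argmax(q_online_next))
    # ... while its value is read from the target network.
    return float(reward + gamma * q_target_next[best_action])

# Toy values: the online network prefers action 1, valued at 20.0 by the target network.
print(double_dqn_target(1.0, np.array([1.0, 3.0, 2.0]), np.array([10.0, 20.0, 30.0]), gamma=0.5))
```

Decoupling selection from evaluation is intended to reduce the overestimation bias that arises when a single network both picks and scores the maximizing action.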

Taxonomy

Topics: Auction Theory and Applications · Game Theory and Applications

Methods: Q-Learning