Memory-Induced Supra-Competitive Outcomes Between Deep Reinforcement Learning Agents in Optimal Trade Execution
Christos Spyridon Koulouris, Carlo Campajola

TL;DR
This study examines whether deep reinforcement learning agents sharing an optimal-execution environment can sustain supra-competitive outcomes, meaning lower implementation shortfalls than the game-theoretic competitive benchmark, particularly when they have access to memory and intra-episode feedback.
Contribution
It demonstrates that feedback, memory, and state-dependent interactions significantly promote supra-competitive outcomes in optimal trade execution scenarios.
Findings
Intra-episode feedback and memory increase supra-competitive outcomes.
Agents with access to recent prices and past actions exhibit more persistent supra-competitive behavior.
Memory and feedback, not just current prices, drive supra-competitive outcomes.
Abstract
In this paper, we investigate whether deep reinforcement-learning agents interacting in a shared optimal-execution environment can sustain supra-competitive outcomes, in the sense of achieving lower implementation shortfalls than the relevant game-theoretical competitive benchmark. We study a two-agent Almgren-Chriss liquidation game and examine how learned behavior depends on intra-episode environment feedback, the ability to interpret the mid-price, and the agent's knowledge of the past. We first use ex-ante schedule-learning agents to remove intra-episode feedback and isolate what can arise when agents commit to complete liquidation trajectories before execution begins. We then allow agents to condition on the evolving state using a variety of DDQN architectures. We find that, when agents are given access to intra-episode history, especially recent prices and own past actions,…
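To make the quantity being compared concrete, the following is a minimal sketch of how implementation shortfall could be computed for two sellers who share one price process. It is an illustration under stated assumptions, not the authors' environment: the linear permanent-impact coefficient `kappa`, the linear temporary-impact coefficient `alpha`, the absence of price noise, and the function name `implementation_shortfalls` are all hypothetical choices made here for clarity.

```python
def implementation_shortfalls(sched_a, sched_b, s0=100.0, kappa=0.001, alpha=0.002):
    """Noise-free implementation shortfall for two sellers sharing one price.

    Assumptions (not taken from the paper): both agents sell, impact is
    linear in the total quantity traded per step, and the mid-price has
    no random component.
    """
    if len(sched_a) != len(sched_b):
        raise ValueError("both schedules must cover the same number of steps")

    mid = s0
    revenue_a = 0.0
    revenue_b = 0.0
    for v_a, v_b in zip(sched_a, sched_b):
        total = v_a + v_b
        # Temporary impact: both agents receive the same depressed price.
        exec_price = mid - alpha * total
        revenue_a += v_a * exec_price
        revenue_b += v_b * exec_price
        # Permanent impact: the mid-price shifts for all later steps.
        mid -= kappa * total

    # Shortfall = value of the inventory at the initial price minus cash received.
    is_a = sum(sched_a) * s0 - revenue_a
    is_b = sum(sched_b) * s0 - revenue_b
    return is_a, is_b


# Two hypothetical schedules, each liquidating 100 shares over 4 steps.
twap = [25, 25, 25, 25]
front_loaded = [40, 30, 20, 10]

both_twap = implementation_shortfalls(twap, twap)
mixed = implementation_shortfalls(front_loaded, twap)
```

In this toy setting, each agent's shortfall depends on the other agent's schedule through the shared price, which is what turns liquidation into a game. Comparing a pair of learned schedules against a benchmark pair then amounts to comparing the shortfalls this kind of function returns.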
