Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution
Tomas Espana, Yadh Hafsi, Fabrizio Lillo, Edoardo Vittori

TL;DR
This paper shows that a model-free Reinforcement Learning agent, specifically a Double Deep Q-Network trained on simulated limit order book data, can execute large orders while adapting to market conditions, and that it outperforms established benchmark strategies in simulation.
Contribution
The paper applies Reinforcement Learning to optimal execution, using a Queue-Reactive Model to simulate a realistic limit order book in which the agent can be trained and evaluated.
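As a rough illustration of the queue-reactive idea (order-flow intensities that depend on the current state of the book), the sketch below simulates a single queue whose limit-order, cancellation, and market-order rates are functions of its size. It is a toy under stated assumptions, not the paper's simulator: `simulate_queue`, `toy_intensities`, and every numeric rate are hypothetical, and a full model would track several price levels and price moves.

```python
import random


def simulate_queue(q0, intensities, horizon, rng):
    """Simulate one order-book queue whose event rates depend on its size.

    `intensities(q)` returns (limit_rate, cancel_rate, market_rate) for size q.
    Returns a list of (time, queue_size) pairs.
    """
    t, q = 0.0, q0
    path = [(t, q)]
    while True:
        lam_limit, lam_cancel, lam_market = intensities(q)
        if q == 0:
            # An empty queue has nothing to cancel or consume.
            lam_cancel = lam_market = 0.0
        total = lam_limit + lam_cancel + lam_market
        if total <= 0:
            break
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total)
        if t > horizon:
            break
        # A limit order adds one unit; a cancel or market order removes one.
        u = rng.random() * total
        q += 1 if u < lam_limit else -1
        path.append((t, q))
    return path


def toy_intensities(q):
    # Hypothetical rates: cancellations grow with queue size, which keeps
    # the queue from growing without bound. Not calibrated to any data.
    return (1.0, 0.1 * q, 0.5)


path = simulate_queue(5, toy_intensities, horizon=100.0, rng=random.Random(0))
print(len(path), path[-1])
```

In a sketch like this, calibrating the state-dependent rates to market data would be the step that makes the simulated order flow react to queue sizes the way a real book does.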
Findings
The RL agent outperforms the benchmark strategies in simulation
The approach adapts to dynamic market conditions
Model-free RL provides robust execution strategies
Abstract
We investigate the use of Reinforcement Learning for the optimal execution of meta-orders, where the objective is to execute incrementally large orders while minimizing implementation shortfall and market impact over an extended period of time. Departing from traditional parametric approaches to price dynamics and impact modeling, we adopt a model-free, data-driven framework. Since policy optimization requires counterfactual feedback that historical data cannot provide, we employ the Queue-Reactive Model to generate realistic and tractable limit order book simulations that encompass transient price impact, and nonlinear and dynamic order flow responses. Methodologically, we train a Double Deep Q-Network agent on a state space comprising time, inventory, price, and depth variables, and evaluate its performance against established benchmarks. Numerical simulation results show that the…
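For readers unfamiliar with Double Deep Q-Networks, the following minimal sketch shows the standard Double DQN bootstrap target: the online network selects the next action and the target network evaluates it. The function name, the example Q-values, the reward, and the discount factor are all hypothetical; the paper's actual network, reward definition, and encoding of the time, inventory, price, and depth state are not reproduced here.

```python
def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Standard Double DQN target for a single transition.

    q_online_next: Q-values of the online network at the next state.
    q_target_next: Q-values of the target network at the next state.
    """
    if done:
        # Terminal transition: no bootstrapping.
        return reward
    # The online network chooses the action ...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... and the target network supplies its value.
    return reward + gamma * q_target_next[best_action]


# Hypothetical values for one transition with three possible actions.
q_online_next = [1.0, 3.0, 2.0]
q_target_next = [0.5, 1.5, 4.0]

y_double = double_dqn_target(-0.2, False, 0.99, q_online_next, q_target_next)
# Plain DQN would instead take the maximum of the target network's values.
y_plain = -0.2 + 0.99 * max(q_target_next)
print(y_double, y_plain)
```

Decoupling action selection from action evaluation in this way is what distinguishes Double DQN from plain DQN, and it is generally used to reduce the overestimation of Q-values.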
Taxonomy
Topics: Supply Chain and Inventory Management · Reinforcement Learning in Robotics · Simulation Techniques and Applications
