Optimal Execution with Reinforcement Learning
Yadh Hafsi, Edoardo Vittori

TL;DR
This paper develops a reinforcement learning (RL) strategy for optimal trade execution. It uses a custom Markov decision process (MDP) formulation and limit order book features sampled at high frequency, and it outperforms standard execution strategies in a simulated market.
Contribution
It introduces an RL framework for optimal trade execution that pairs a custom MDP formulation with the multi-agent market simulator ABIDES, and the authors present it as a practical foundation for real-world trading.
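The summary does not list the features that make up the MDP state, so the following is only an illustrative sketch of the kind of inputs that can be derived from a limit order book snapshot. The `BookSnapshot` type, the `lob_features` helper, and the choice of mid-price, spread, and volume imbalance are all assumptions made for illustration, not the paper's actual state definition.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class BookSnapshot:
    """Hypothetical limit order book snapshot (not the paper's data structure)."""
    bids: List[Tuple[float, int]]  # (price, size), best bid first
    asks: List[Tuple[float, int]]  # (price, size), best ask first


def lob_features(book: BookSnapshot, depth: int = 3) -> Dict[str, float]:
    """Compute a few common order book features over the top `depth` levels."""
    best_bid = book.bids[0][0]
    best_ask = book.asks[0][0]
    bid_vol = sum(size for _, size in book.bids[:depth])
    ask_vol = sum(size for _, size in book.asks[:depth])
    total = bid_vol + ask_vol
    return {
        "mid_price": (best_bid + best_ask) / 2,
        "spread": best_ask - best_bid,
        # Ranges from -1 (all resting volume on the ask side) to +1 (all on the bid side).
        "imbalance": (bid_vol - ask_vol) / total if total else 0.0,
    }


snapshot = BookSnapshot(
    bids=[(99.0, 300), (98.0, 100)],
    asks=[(101.0, 100), (102.0, 100)],
)
features = lob_features(snapshot, depth=2)
```

A feature vector of this kind would be recomputed at every decision step, which is what operating at high frequency on the current book state amounts to in practice.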
Findings
RL agent outperforms standard execution strategies
Operating at high frequency on limit order book features gives the agent finer control over execution
Results are obtained in the ABIDES multi-agent simulator rather than on historical data
Abstract
This study investigates the development of an optimal execution strategy through reinforcement learning, aiming to determine the most effective approach for traders to buy and sell inventory within a finite time horizon. Our proposed model leverages input features derived from the current state of the limit order book and operates at a high frequency to maximize control. To simulate this environment and overcome the limitations associated with relying on historical data, we utilize the multi-agent market simulator ABIDES, which provides a diverse range of depth levels within the limit order book. We present a custom MDP formulation followed by the results of our methodology and benchmark the performance against standard execution strategies. Results show that the reinforcement learning agent outperforms standard strategies and offers a practical foundation for real-world trading…
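The abstract does not name the standard execution strategies used as benchmarks. As a hedged illustration, the sketch below assumes a time-weighted average price (TWAP) schedule as one such baseline and measures execution quality as implementation shortfall in basis points against the arrival price. The function names `twap_schedule` and `implementation_shortfall` and the sample fills are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def twap_schedule(total_shares: int, n_slices: int) -> List[int]:
    """Split a parent order into near-equal child orders, one per time slice."""
    if n_slices <= 0:
        raise ValueError("n_slices must be positive")
    base, remainder = divmod(total_shares, n_slices)
    # Spread the leftover shares over the earliest slices.
    return [base + (1 if i < remainder else 0) for i in range(n_slices)]


def implementation_shortfall(
    fills: Sequence[Tuple[float, int]],
    arrival_price: float,
    side: str = "buy",
) -> float:
    """Return execution cost in basis points relative to the arrival price.

    `fills` is a sequence of (price, quantity) pairs. A positive value means
    the order was executed at a worse average price than the arrival price.
    """
    quantity = sum(q for _, q in fills)
    if quantity == 0:
        raise ValueError("no shares were filled")
    avg_price = sum(p * q for p, q in fills) / quantity
    sign = 1 if side == "buy" else -1
    return sign * (avg_price - arrival_price) / arrival_price * 1e4


schedule = twap_schedule(1000, 3)
cost_bps = implementation_shortfall([(100.0, 50), (102.0, 50)], arrival_price=100.0)
```

Under these assumptions, an RL agent and the TWAP baseline would be run on the same simulated sessions and compared on the distribution of their shortfall values.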
Taxonomy
Topics: Machine Learning and Algorithms · Radiation Effects in Electronics · Distributed systems and fault tolerance
