RL-Exec: Impact-Aware Reinforcement Learning for Opportunistic Optimal Liquidation, Outperforms TWAP and a Book-Liquidity VWAP on BTC-USD Replays
Enzo Duflot, Stanislas Robineau

TL;DR
RL-Exec, a PPO reinforcement learning agent trained on historical BTC-USD limit-order-book replays, outperforms TWAP and a book-liquidity VWAP-like baseline on optimal liquidation over fixed deadlines.
Contribution
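Introduces RL-Exec, a PPO-based agent trained in a replay environment that models endogenous transient impact (resilience), partial fills, maker/taker fees, and latency. The policy observes depth-20 LOB features plus microstructure indicators and acts under a sell-only inventory constraint.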
Findings
RL-Exec outperforms both the TWAP and the book-liquidity VWAP-like baselines on the held-out test month (Feb-2020).
Performance gap widens with longer execution horizons.
Improvements are statistically significant under one-sided Wilcoxon signed-rank tests computed on per-day aggregates.
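The abstract describes the testing behind these findings as one-sided Wilcoxon signed-rank tests on per-day aggregates, with ten start times per test day collapsed into a single daily score. The following is a minimal sketch of that protocol, not the authors' code. It assumes the daily score is the mean of the per-run RL-minus-baseline differences, that zero differences are dropped, and that an exact p-value is wanted; the function names `daily_scores` and `wilcoxon_greater` are hypothetical.

```python
from statistics import mean


def daily_scores(per_day_runs):
    """Collapse each day's runs into one score (assumed here: the mean)."""
    return [mean(runs) for runs in per_day_runs]


def wilcoxon_greater(diffs):
    """Exact one-sided Wilcoxon signed-rank test for H1: differences tend to be > 0.

    Returns (W_plus, p_value). Zero differences are dropped (an assumed convention).
    """
    d = [x for x in diffs if x != 0]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))

    # Doubled average ranks stay integral even when |d| values tie.
    ranks2 = [0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = (i + 1) + (j + 1)  # 2 * mean of ranks i+1..j+1
        i = j + 1

    w_plus2 = sum(r for r, x in zip(ranks2, d) if x > 0)

    # Under H0 every sign pattern is equally likely: count patterns by rank sum.
    counts = {0: 1}
    for r in ranks2:
        nxt = dict(counts)
        for s, c in counts.items():
            nxt[s + r] = nxt.get(s + r, 0) + c
        counts = nxt

    tail = sum(c for s, c in counts.items() if s >= w_plus2)
    return w_plus2 / 2, tail / 2 ** len(d)
```

Testing on one score per day, rather than on all individual runs, keeps the sample size equal to the number of test days, which is how the protocol avoids pseudo-replication.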
Abstract
We study opportunistic optimal liquidation over fixed deadlines on BTC-USD limit-order books (LOB). We present RL-Exec, a PPO agent trained on historical replays augmented with endogenous transient impact (resilience), partial fills, maker/taker fees, and latency. The policy observes depth-20 LOB features plus microstructure indicators and acts under a sell-only inventory constraint to reach a residual target. Evaluation follows a strict time split (train: Jan-2020; test: Feb-2020) and a per-day protocol: for each test day we run ten independent start times and aggregate to a single daily score, avoiding pseudo-replication. We compare the agent to (i) TWAP and (ii) a VWAP-like baseline allocating using opposite-side order-book liquidity (top-20 levels), both executed on identical timestamps and costs. Statistical inference uses one-sided Wilcoxon signed-rank tests on daily RL-baseline…
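The two baselines named in the abstract can be illustrated with a short sketch. It is not the paper's implementation: it assumes a fixed number of decision steps, treats the bid side as the opposite side (since the agent only sells), and, as a simplification, assumes the depth profile for the whole horizon is available up front. The names `twap_schedule` and `book_liquidity_schedule` are hypothetical.

```python
def twap_schedule(inventory, n_steps):
    """TWAP: sell an equal slice of the inventory at every decision step."""
    return [inventory / n_steps] * n_steps


def book_liquidity_schedule(inventory, bid_depth_per_step, levels=20):
    """VWAP-like: size each slice in proportion to opposite-side (bid) depth.

    bid_depth_per_step holds, for each step, the resting bid sizes per level;
    only the top `levels` entries are counted.
    """
    liquidity = [sum(depth[:levels]) for depth in bid_depth_per_step]
    total = sum(liquidity)
    if total == 0:
        # No visible liquidity at all: fall back to equal slices.
        return twap_schedule(inventory, len(liquidity))
    return [inventory * liq / total for liq in liquidity]
```

Both functions return target quantities only. In the evaluation described above, the baselines and the agent are executed on identical timestamps and costs, so fills, fees, and impact would be applied by the same replay environment.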
Taxonomy
Topics: Financial Markets and Investment Strategies · Supply Chain and Inventory Management · Auction Theory and Applications
