An Adaptive Dual-level Reinforcement Learning Approach for Optimal Trade Execution
Soohan Kim, Jimyeong Kim, Hong Kee Sul, Youngjoon Hong

TL;DR
This paper introduces a dual-level reinforcement learning strategy utilizing Transformer and LSTM models to accurately track daily VWAP in stock trading, addressing limitations of short-horizon models.
Contribution
It presents a novel dual-level approach combining Transformer and LSTM with PPO to improve VWAP tracking accuracy over longer trading horizons.
Findings
Enhanced accuracy in VWAP approximation compared to previous models
Effective modeling of intraday volume patterns using U-shaped volume distribution
Dual-level architecture outperforms single-level approaches in experiments
Abstract
The purpose of this research is to devise a tactic that can closely track the daily cumulative volume-weighted average price (VWAP) using reinforcement learning. Previous studies often choose a relatively short trading horizon to implement their models, making it difficult to accurately track the daily cumulative VWAP since the variations of financial data are often insignificant within the short trading horizon. In this paper, we aim to develop a strategy that can accurately track the daily cumulative VWAP while minimizing the deviation from the VWAP. We propose a method that leverages the U-shaped pattern of intraday stock trade volumes and use Proximal Policy Optimization (PPO) as the learning algorithm. Our method follows a dual-level approach: a Transformer model that captures the overall (global) distribution of daily volumes in a U-shape, and an LSTM model that handles the…
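To make the tracking objective in the abstract concrete, the following is a minimal sketch of how a daily VWAP and an execution schedule's deviation from it could be computed. It is not the authors' implementation: the quadratic U-shaped profile, the `depth` parameter, the helper names (`vwap`, `u_shaped_profile`, `tracking_error_bps`), the 13-bin day, and the toy random prices are all assumptions chosen for illustration.

```python
import numpy as np


def vwap(prices, volumes):
    """Volume-weighted average price: sum(p_i * v_i) / sum(v_i)."""
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    return float(np.sum(prices * volumes) / np.sum(volumes))


def u_shaped_profile(n_bins, depth=0.7):
    """Hypothetical intraday volume profile (an assumption, not from the paper).

    A quadratic curve that is highest at the open and close and lowest
    mid-day, normalized so the weights sum to 1.
    """
    x = np.linspace(-1.0, 1.0, n_bins)
    weights = (1.0 - depth) + depth * x**2
    return weights / weights.sum()


def tracking_error_bps(exec_prices, exec_volumes, mkt_prices, mkt_volumes):
    """Signed deviation of the executed VWAP from the market VWAP, in basis points."""
    market = vwap(mkt_prices, mkt_volumes)
    executed = vwap(exec_prices, exec_volumes)
    return 1e4 * (executed - market) / market


# Toy trading day split into 13 bins (illustrative data only).
n_bins = 13
rng = np.random.default_rng(0)
mkt_prices = 100.0 + np.cumsum(rng.normal(0.0, 0.1, n_bins))
mkt_volumes = 1_000_000 * u_shaped_profile(n_bins)

order_size = 50_000
u_schedule = order_size * u_shaped_profile(n_bins)      # follows the U shape
flat_schedule = np.full(n_bins, order_size / n_bins)    # equal slice per bin

# For simplicity, fills are assumed to occur at the bin's market price.
err_u = tracking_error_bps(mkt_prices, u_schedule, mkt_prices, mkt_volumes)
err_flat = tracking_error_bps(mkt_prices, flat_schedule, mkt_prices, mkt_volumes)
print(f"U-shaped schedule error: {err_u:.4f} bps")
print(f"Flat schedule error:     {err_flat:.4f} bps")
```

In this toy setup the U-shaped schedule is exactly proportional to the market volume, so its VWAP matches the market VWAP up to floating-point error, while the flat schedule generally does not. In practice the day's volume distribution is unknown in advance, which is the gap a learned allocation such as the one described in the abstract is meant to address.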
Taxonomy
Topics: Stock Market Forecasting Methods · Energy Load and Power Forecasting · Financial Markets and Investment Strategies
Methods: Attention Is All You Need · Layer Normalization · Label Smoothing · Linear Layer · Multi-Head Attention · Softmax · Tanh Activation · Dense Connections · Dropout · Sigmoid Activation
