Deep reinforcement learning for optimal trading with partial information
Andrea Macrì, Sebastian Jaimungal, Fabrizio Lillo

TL;DR
This paper introduces three deep reinforcement learning algorithms that incorporate recurrent neural networks to extract latent information from market signals with regime-switching dynamics, improving trading strategy performance.
Contribution
It develops and compares three DDPG-based algorithms with RNNs for optimal trading under partial information, highlighting the benefits of probabilistic regime filtering.
Findings
prob-DDPG achieves higher cumulative rewards
Probabilistic regime filtering enhances robustness
Intermediate performance of hid-DDPG
Abstract
Reinforcement Learning (RL) applied to financial problems is a lively area of research. The use of RL for optimal trading strategies that exploit latent information in the market has not, to the best of our knowledge, been widely studied. In this paper we study an optimal trading problem, where a trading signal follows an Ornstein-Uhlenbeck process with regime-switching dynamics. We employ a blend of RL and Recurrent Neural Networks (RNN) to extract as much underlying information as possible from the trading signal with latent parameters. The latent parameters driving mean reversion, speed, and volatility are filtered from observations of the signal, and trading strategies are derived via RL. To address this problem, we propose three Deep Deterministic Policy Gradient (DDPG)-based algorithms that integrate Gated Recurrent Unit (GRU) networks to capture temporal…
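To make the signal model in the abstract concrete, the following is a minimal sketch of an Ornstein-Uhlenbeck process whose parameters switch with a hidden Markov regime. It is not the authors' code: the function name `simulate_regime_switching_ou`, the Euler-Maruyama discretization, the choice to let the mean-reversion level, speed, and volatility all depend on the regime, and every numeric value are assumptions made for illustration only.

```python
import numpy as np

def simulate_regime_switching_ou(n_steps, dt, thetas, kappas, sigmas,
                                 trans_matrix, s0=0.0, seed=0):
    """Simulate an OU signal whose parameters follow a hidden Markov chain.

    Hypothetical helper for illustration. Each regime k has its own
    mean-reversion level thetas[k], speed kappas[k], and volatility sigmas[k];
    trans_matrix[i, j] is the per-step probability of moving from i to j.
    """
    rng = np.random.default_rng(seed)
    thetas = np.asarray(thetas, dtype=float)
    kappas = np.asarray(kappas, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    trans_matrix = np.asarray(trans_matrix, dtype=float)
    n_regimes = len(thetas)

    regimes = np.empty(n_steps + 1, dtype=int)
    signal = np.empty(n_steps + 1)
    regimes[0] = rng.integers(n_regimes)  # uniform initial regime (assumption)
    signal[0] = s0

    for t in range(n_steps):
        k = regimes[t]
        # Euler-Maruyama step of dS = kappa_k (theta_k - S) dt + sigma_k dW
        drift = kappas[k] * (thetas[k] - signal[t]) * dt
        shock = sigmas[k] * np.sqrt(dt) * rng.standard_normal()
        signal[t + 1] = signal[t] + drift + shock
        # Draw the next latent regime from row k of the transition matrix
        regimes[t + 1] = rng.choice(n_regimes, p=trans_matrix[k])

    return signal, regimes

# Illustrative two-regime example; all numbers are made up.
signal, regimes = simulate_regime_switching_ou(
    n_steps=1000, dt=0.01,
    thetas=[-1.0, 1.0], kappas=[5.0, 5.0], sigmas=[0.3, 0.3],
    trans_matrix=[[0.99, 0.01], [0.01, 0.99]],
)
```

In this setup a trading agent would observe only `signal`; the `regimes` path stays hidden, which is what makes the problem one of partial information and motivates filtering the latent state from the signal history.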
Taxonomy
Topics: Stock Market Forecasting Methods · Advanced Bandit Algorithms Research · Complex Systems and Time Series Analysis
