Deep reinforcement learning for optimal trading with partial information

Andrea Macrì; Sebastian Jaimungal; Fabrizio Lillo

arXiv:2511.00190·q-fin.TR·November 4, 2025

Open Access

TL;DR

This paper introduces three deep reinforcement learning algorithms that incorporate recurrent neural networks to extract latent information from market signals with regime-switching dynamics, improving trading strategy performance.

Contribution

The paper develops and compares three DDPG-based algorithms that integrate GRU networks for optimal trading under partial information, highlighting the benefits of probabilistic regime filtering.

Findings

01  prob-DDPG achieves higher cumulative rewards

02  Probabilistic regime filtering enhances robustness

03  Intermediate performance of hid-DDPG

Abstract

Reinforcement Learning (RL) applied to financial problems is a lively area of research. The use of RL for optimal trading strategies that exploit latent information in the market is, to the best of our knowledge, not widely tackled. In this paper we study an optimal trading problem in which a trading signal follows an Ornstein-Uhlenbeck process with regime-switching dynamics. We employ a blend of RL and Recurrent Neural Networks (RNN) to extract as much underlying information as possible from the trading signal with latent parameters. The latent parameters driving mean reversion, speed, and volatility are filtered from observations of the signal, and trading strategies are derived via RL. To address this problem, we propose three Deep Deterministic Policy Gradient (DDPG)-based algorithms that integrate Gated Recurrent Unit (GRU) networks to capture temporal…
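
To make the signal model in the abstract concrete, the following is a minimal sketch of an Ornstein-Uhlenbeck process whose long-run level switches between hidden regimes. It is an illustration under stated assumptions, not the authors' setup: the function name `simulate_regime_switching_ou`, the two levels in `thetas`, the values of `kappa`, `sigma`, and `switch_prob`, and the Euler-Maruyama discretization are all hypothetical choices. For simplicity only the mean-reversion level switches here, whereas the abstract also treats speed and volatility as latent.

```python
import numpy as np


def simulate_regime_switching_ou(n_steps=1000, dt=0.01, thetas=(-1.0, 1.0),
                                 kappa=5.0, sigma=0.5, switch_prob=0.01,
                                 seed=0):
    """Simulate an OU signal whose long-run level follows a hidden Markov chain.

    All parameter values are illustrative assumptions, not taken from the paper.
    Returns (signal, regimes), both arrays of length n_steps + 1.
    """
    rng = np.random.default_rng(seed)
    thetas = np.asarray(thetas, dtype=float)
    n_regimes = len(thetas)

    regimes = np.empty(n_steps + 1, dtype=int)
    signal = np.empty(n_steps + 1)

    # Start in a random regime, with the signal at that regime's level.
    regimes[0] = rng.integers(n_regimes)
    signal[0] = thetas[regimes[0]]

    for t in range(n_steps):
        # Euler-Maruyama step: dS = kappa * (theta - S) dt + sigma dW.
        drift = kappa * (thetas[regimes[t]] - signal[t]) * dt
        shock = sigma * np.sqrt(dt) * rng.standard_normal()
        signal[t + 1] = signal[t] + drift + shock

        # With probability switch_prob, jump to a different regime.
        if rng.random() < switch_prob:
            others = [k for k in range(n_regimes) if k != regimes[t]]
            regimes[t + 1] = rng.choice(others)
        else:
            regimes[t + 1] = regimes[t]

    return signal, regimes


if __name__ == "__main__":
    signal, regimes = simulate_regime_switching_ou()
    print("number of regime switches:", int(np.sum(np.diff(regimes) != 0)))
```

In the partial-information setting the abstract describes, an agent would observe only `signal`; the `regimes` path stays hidden and has to be inferred from the signal's history.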

Taxonomy

Topics: Stock Market Forecasting Methods · Advanced Bandit Algorithms Research · Complex Systems and Time Series Analysis