From Bandits Model to Deep Deterministic Policy Gradient, Reinforcement Learning with Contextual Information

Zhendong Shi; Xiaoli Wei; Ercan E. Kuruoglu

arXiv:2310.00642·cs.LG·October 3, 2023

Open Access

TL;DR

This paper enhances reinforcement learning for financial trading by integrating contextual information through Thompson sampling and supervised RL, combined with CPPI, to improve decision-making speed and effectiveness in dynamic markets.

Contribution

It introduces methods to incorporate contextual information into RL for trading and merges CPPI with DDPG to improve convergence in financial applications.
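As background for the CPPI component, the textbook Constant Proportion Portfolio Insurance rule can be sketched as follows. This is a minimal illustration of the standard rule, not the paper's implementation: the function name `cppi_allocation`, its parameters, and the no-leverage cap are assumptions made for this example, and how the paper couples the rule with DDPG is not shown here.

```python
def cppi_allocation(portfolio_value, floor, multiplier):
    """Split a portfolio between a risky and a safe asset with the basic CPPI rule.

    Hypothetical helper for illustration only.
    """
    # Cushion: the amount by which the portfolio exceeds the protected floor.
    cushion = max(portfolio_value - floor, 0.0)
    # Risky exposure is a constant multiple of the cushion.
    # Assumption: exposure is capped at the portfolio value (no leverage).
    risky = min(multiplier * cushion, portfolio_value)
    # Whatever is not in the risky asset is held in the safe asset.
    safe = portfolio_value - risky
    return risky, safe
```

For example, with a portfolio value of 100, a floor of 80, and a multiplier of 3, the cushion is 20, so 60 goes to the risky asset and 40 to the safe asset. Once the portfolio falls to or below the floor, the risky allocation drops to zero.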

Findings

01 Both methods, contextual Thompson sampling and reinforcement learning under supervision, accelerate reinforcement learning convergence.

02 Integration of CPPI improves trading strategy performance.

03 Enhanced RL approaches adapt better to market dynamics.

Abstract

Taking the right actions to make a profit in a sequential process remains difficult because of fast dynamics and substantial uncertainty in many application scenarios. In such complicated environments, reinforcement learning (RL), a reward-oriented strategy for optimal control, has emerged as a promising technique for this strategic decision-making problem. However, RL also has shortcomings, including excessive resource consumption and an inability to obtain optimal solutions quickly, that make it unsuitable for many financial problems and for quantitative trading markets in particular. In this study, we use two methods that exploit contextual information to overcome this issue: contextual Thompson sampling and reinforcement learning under supervision, both of which can accelerate the iterations in search of the best answer. In order to…
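To make the idea of contextual Thompson sampling concrete, here is a minimal sketch under strong simplifying assumptions: contexts are discrete labels (for example a market regime), rewards are binary, and each (context, action) pair keeps its own Beta posterior. The class name `ContextualThompsonSampler` and its methods are hypothetical and chosen for illustration; the paper's actual formulation may differ.

```python
import random
from collections import defaultdict


class ContextualThompsonSampler:
    """Tabular Beta-Bernoulli Thompson sampling with one posterior per context."""

    def __init__(self, n_actions, seed=None):
        self.n_actions = n_actions
        self.rng = random.Random(seed)
        # Beta(1, 1) prior for every action, created lazily per context.
        self.params = defaultdict(
            lambda: [[1.0, 1.0] for _ in range(n_actions)]
        )

    def select(self, context):
        # Draw one sample from each action's posterior for this context
        # and play the action whose sample is largest.
        samples = [self.rng.betavariate(a, b) for a, b in self.params[context]]
        return max(range(self.n_actions), key=samples.__getitem__)

    def update(self, context, action, reward):
        # Assumption: reward is binary (1 = profitable step, 0 = not).
        if reward:
            self.params[context][action][0] += 1.0
        else:
            self.params[context][action][1] += 1.0
```

In use, the agent would call `select(context)` to choose an action, observe a binary outcome, and pass it to `update`. Because posteriors are stored per context, what is learned in one regime does not overwrite what is learned in another.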

Taxonomy

Topics: Stock Market Forecasting Methods · Advanced Bandit Algorithms Research · Financial Markets and Investment Strategies