Adversarial Deep Reinforcement Learning in Portfolio Management

Zhipeng Liang; Hao Chen; Junhao Zhu; Kangkang Jiang; Yanran Li

arXiv:1808.09940 · q-fin.PM · November 20, 2018 · 57 citations

TL;DR

This paper evaluates three deep reinforcement learning algorithms (DDPG, PPO, and PG) for portfolio management and introduces an adversarial training method. In back tests on the Chinese stock market, PG outperforms DDPG and PPO, and PG with adversarial training outperforms the Uniform Constant Rebalanced Portfolio (UCRP) baseline.

Contribution

It implements and compares DDPG, PPO, and PG algorithms in portfolio management, introduces an adversarial training method, and shows PG with adversarial training outperforms traditional methods and UCRP.

Findings

01 PG outperforms DDPG and PPO in the Chinese stock market.

02 Adversarial training improves training efficiency and financial metrics.

03 Policy Gradient with adversarial training surpasses UCRP in back tests.
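
For readers unfamiliar with the UCRP baseline named in the findings: a Uniform Constant Rebalanced Portfolio resets its holdings to equal weights across all assets at the start of every period. The sketch below is a minimal illustration in plain Python, not the authors' code. The function name `ucrp_wealth`, the price-relative input format, and the omission of transaction costs are all assumptions made for illustration.

```python
def ucrp_wealth(price_relatives):
    """Cumulative wealth of a Uniform Constant Rebalanced Portfolio.

    price_relatives: list of periods, each a list of per-asset price
    relatives (price_t / price_{t-1}). This input format is a
    hypothetical choice for the example, not taken from the paper.
    Transaction costs are ignored (an assumption).
    """
    wealth = 1.0
    curve = []
    for period in price_relatives:
        n = len(period)
        # Rebalance to equal weights 1/n, then apply this period's returns.
        wealth *= sum(x / n for x in period)
        curve.append(wealth)
    return curve


# Two assets over two periods (made-up numbers).
curve = ucrp_wealth([[1.1, 0.9], [1.2, 1.0]])
print([round(w, 4) for w in curve])  # [1.0, 1.1]
```

A learned policy such as PG would replace the fixed 1/n weights with weights produced by the agent at each period; comparing the two wealth curves is the kind of back-test comparison the findings refer to.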

Abstract

In this paper, we implement three state-of-the-art continuous reinforcement learning algorithms, Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO) and Policy Gradient (PG), in portfolio management. All of them are widely used in game playing and robot control. Moreover, PPO has appealing theoretical properties that make it a promising candidate for portfolio management. We present their performance under different settings, including different learning rates, objective functions and feature combinations, in order to provide insights for parameter tuning, feature selection and data preparation. We also conduct intensive experiments in the China stock market and show that PG is more desirable in financial markets than DDPG and PPO, although both of them are more advanced. Moreover, we propose a so-called Adversarial Training method and show that it can greatly improve…

Taxonomy

Topics: Stock Market Forecasting Methods · Financial Markets and Investment Strategies · Advanced Bandit Algorithms Research

Methods: Experience Replay · Entropy Regularization · Dense Connections · Weight Decay · Adam · Convolution · Batch Normalization · Deep Deterministic Policy Gradient · Proximal Policy Optimization