Balancing Profit, Risk, and Sustainability for Portfolio Management

Charl Maree; Christian W. Omlin

arXiv:2207.02134·q-fin.PM·July 6, 2022

TL;DR

This paper introduces a reinforcement learning-based portfolio optimization method that balances profit, risk, and sustainability. It uses a novel utility function, optimizes policy parameters with a genetic algorithm instead of gradient descent, and is reported to outperform MADDPG.

Contribution

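It presents a new utility function that combines the Sharpe ratio (representing risk) with ESG scores (representing sustainability). It also replaces gradient descent with a genetic algorithm for parameter optimization, because MADDPG's flat policy gradients prevented it from finding the optimum policy.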
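
The exact form of the utility function is not given in this summary, so the following is only a minimal sketch of how a Sharpe ratio term and an ESG term could be combined into one scalar. The additive form, the names `sharpe_ratio`, `portfolio_esg`, and `utility`, and the weights `w_risk` and `w_esg` are all assumptions made for illustration, not the authors' definition.

```python
import numpy as np


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = np.asarray(returns, dtype=float) - risk_free
    std = excess.std(ddof=1)
    # Guard against division by (near) zero for constant return series.
    return 0.0 if std < 1e-12 else float(excess.mean() / std)


def portfolio_esg(weights, esg_scores):
    """Allocation-weighted average ESG score of the portfolio."""
    weights = np.asarray(weights, dtype=float)
    scores = np.asarray(esg_scores, dtype=float)
    return float(weights @ scores)


def utility(returns, weights, esg_scores, w_risk=1.0, w_esg=1.0):
    """Hypothetical additive utility (an assumption, not the paper's formula)."""
    return w_risk * sharpe_ratio(returns) + w_esg * portfolio_esg(weights, esg_scores)


# Toy daily returns and two allocations over the same two stocks.
daily_returns = [0.0, 0.02]
low_esg = utility(daily_returns, weights=[0.9, 0.1], esg_scores=[0.2, 0.8])
high_esg = utility(daily_returns, weights=[0.1, 0.9], esg_scores=[0.2, 0.8])
```

With identical returns, the allocation tilted toward the higher-scoring stock receives the higher utility, which is the behaviour such a combined objective is meant to reward.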

Findings

01. Outperforms MADDPG in portfolio optimization tasks

02. Improves on deep Q-learning by enabling continuous actions

03. Effectively incorporates risk and sustainability into the optimization process

Abstract

Stock portfolio optimization is the process of continuous reallocation of funds to a selection of stocks. This is a particularly well-suited problem for reinforcement learning, as daily rewards are compounding and objective functions may include more than just profit, e.g., risk and sustainability. We developed a novel utility function with the Sharpe ratio representing risk and the environmental, social, and governance score (ESG) representing sustainability. We show that a state-of-the-art policy gradient method - multi-agent deep deterministic policy gradients (MADDPG) - fails to find the optimum policy due to flat policy gradients and we therefore replaced gradient descent with a genetic algorithm for parameter optimization. We show that our system outperforms MADDPG while improving on deep Q-learning approaches by allowing for continuous action spaces. Crucially, by incorporating…
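
To illustrate the idea of replacing gradient descent with a genetic algorithm, here is a minimal sketch of a gradient-free parameter search. It is not the authors' implementation: the function `evolve`, its hyperparameters, the softmax mapping from parameters to allocations, and the toy fitness based on made-up mean returns are all assumptions chosen to keep the example small.

```python
import numpy as np


def softmax(x):
    """Map an unconstrained parameter vector to allocations that sum to 1."""
    z = np.exp(x - np.max(x))
    return z / z.sum()


def evolve(fitness, dim, pop_size=40, elite=8, generations=60, sigma=0.3, seed=0):
    """Simple genetic algorithm: elitist selection, uniform crossover, Gaussian mutation."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, dim))
    n_children = pop_size - elite
    for _ in range(generations):
        scores = np.array([fitness(p) for p in pop])
        # Keep the best `elite` individuals unchanged (elitism).
        parents = pop[np.argsort(scores)[::-1][:elite]]
        # Uniform crossover: each gene comes from one of two random parents.
        idx_a = rng.integers(0, elite, size=n_children)
        idx_b = rng.integers(0, elite, size=n_children)
        mask = rng.random((n_children, dim)) < 0.5
        children = np.where(mask, parents[idx_a], parents[idx_b])
        # Gaussian mutation keeps the search exploring; no gradients are needed.
        children = children + rng.normal(scale=sigma, size=children.shape)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(p) for p in pop])
    best = int(np.argmax(scores))
    return pop[best], float(scores[best])


# Hypothetical mean daily returns for three stocks (illustrative values only).
MEAN_RETURNS = np.array([0.01, 0.03, 0.02])


def toy_fitness(theta):
    """Expected portfolio return under the softmax allocation of theta."""
    return float(softmax(theta) @ MEAN_RETURNS)


best_theta, best_score = evolve(toy_fitness, dim=3)
best_weights = softmax(best_theta)
```

Because the search only ever calls `fitness`, it does not depend on the gradient of the objective, which is why an approach of this kind remains usable when policy gradients are flat. In a real system the fitness would be the utility accumulated by the policy over a trading period rather than this toy expected return.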

Taxonomy

Methods: Experience Replay · Weight Decay · Convolution · Dense Connections · Batch Normalization · Q-Learning · Adam · MADDPG