Markowitz Meets Bellman: Knowledge-distilled Reinforcement Learning for Portfolio Management
Gang Hu, Ming Gu

TL;DR
This paper presents KDD, a hybrid reinforcement learning approach for portfolio management that combines Markowitz's theory with knowledge distillation, achieving superior returns and risk management compared to traditional models.
Contribution
Introduces KDD, a novel two-stage training method integrating Markowitz theory with reinforcement learning via knowledge distillation for improved portfolio optimization.
Findings
KDD achieves the highest yield and the highest Sharpe ratio (2.03) among the compared models.
KDD outperforms standard financial and AI models in profitability and risk.
Among scenarios with comparable returns, the method delivers top profitability with the lowest risk.
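For readers unfamiliar with the Sharpe ratio cited in the findings, the following is a minimal sketch of how it is commonly computed from a series of per-period returns. The function name `sharpe_ratio`, the 252-period annualization factor, and the zero default risk-free rate are assumptions made for illustration; the summary does not state which convention the authors used to obtain 2.03.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period simple returns.

    Assumptions (not taken from the paper): `risk_free_rate` is an annual
    rate spread evenly across periods, and 252 trading periods per year.
    """
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in returns]
    mean_excess = statistics.mean(excess)
    std_excess = statistics.stdev(excess)  # sample standard deviation
    if std_excess == 0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    # Scale the per-period ratio to an annual figure.
    return mean_excess / std_excess * math.sqrt(periods_per_year)


# Hypothetical daily returns, purely for illustration.
daily_returns = [0.010, -0.005, 0.007, 0.002, -0.001]
print(sharpe_ratio(daily_returns))
```

A higher value means more excess return per unit of volatility, which is why the ratio is used here as a combined profitability and risk measure.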
Abstract
Investment portfolios, central to finance, balance potential returns and risks. This paper introduces a hybrid approach combining Markowitz's portfolio theory with reinforcement learning, utilizing knowledge distillation for training agents. In particular, our proposed method, called KDD (Knowledge Distillation DDPG), consists of two training stages: a supervised learning stage and a reinforcement learning stage. The trained agents optimize portfolio assembly. A comparative analysis against standard financial models and AI frameworks, using metrics like returns, the Sharpe ratio, and nine evaluation indices, reveals our model's superiority. It notably achieves the highest yield and Sharpe ratio of 2.03, ensuring top profitability with the lowest risk in comparable return scenarios.
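The abstract does not spell out how the Markowitz solution serves as a teacher in the supervised stage, so the following is only a hypothetical sketch of the general idea: compute portfolio weights from Markowitz-style mean-variance reasoning, then penalize a student policy for deviating from them. The two-asset global minimum-variance formula is standard; the helper names (`min_variance_weights_2`, `distillation_loss`), the softmax output layer, and the mean-squared-error loss are all assumptions, not the authors' implementation.

```python
import math


def min_variance_weights_2(var_a, var_b, cov_ab):
    """Global minimum-variance weights for two assets (closed form).

    Weights sum to 1; a weight can be negative (a short position) when
    the covariance exceeds one asset's variance.
    """
    denom = var_a + var_b - 2.0 * cov_ab
    if denom <= 0:
        raise ValueError("degenerate covariance: assets are perfectly correlated")
    w_a = (var_b - cov_ab) / denom
    return [w_a, 1.0 - w_a]


def softmax(logits):
    """Map raw policy outputs to long-only weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_weights):
    """Mean squared error between student and teacher weights.

    Hypothetical loss choice; the paper's actual objective is not given
    in this summary.
    """
    student = softmax(student_logits)
    return sum((s - t) ** 2 for s, t in zip(student, teacher_weights)) / len(student)


# Illustrative inputs: asset A is four times as volatile (in variance) as B.
teacher = min_variance_weights_2(var_a=0.04, var_b=0.01, cov_ab=0.0)
print(teacher)                                  # teacher allocation
print(distillation_loss([0.0, 0.0], teacher))   # untrained student: equal weights
```

In a two-stage scheme of this kind, a loss like the one above would pre-train the actor before a reinforcement learning algorithm such as DDPG fine-tunes it on a reward signal.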
Taxonomy
Topics: Stock Market Forecasting Methods · Financial Markets and Investment Strategies
Methods: Knowledge Distillation
