Mean-Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study

Yilie Huang; Yanwei Jia; Xun Yu Zhou

arXiv:2412.16175 · q-fin.PM · March 31, 2026

TL;DR

This paper introduces a reinforcement learning approach for continuous-time mean-variance portfolio selection that learns investment strategies directly from data without estimating market coefficients, demonstrating strong empirical performance.

Contribution

It develops a novel data-driven RL algorithm for portfolio optimization in continuous time, with theoretical regret guarantees and extensive empirical validation.

Findings

01

The RL strategy outperforms traditional methods in volatile markets.

02

The algorithm achieves a sublinear regret bound in terms of the Sharpe ratio.

03

Empirical results show consistent outperformance on S&P 500 data.

Abstract

We study continuous-time mean-variance portfolio selection in markets where stock prices are diffusion processes driven by observable factors that are also diffusion processes, yet the coefficients of these processes are unknown. Based on the recently developed reinforcement learning (RL) theory for diffusion processes, we present a general data-driven RL approach that learns the pre-committed investment strategy directly without attempting to learn or estimate the market coefficients. For multi-stock Black-Scholes markets without factors, we further devise an algorithm and prove its performance guarantee by deriving a sublinear regret bound in terms of the Sharpe ratio. We then carry out an extensive empirical study implementing this algorithm to compare its performance and trading characteristics, evaluated under a host of common metrics, with a large number of widely employed…
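To make the setting concrete, here is a minimal Python sketch of the two ingredients the abstract names: a multi-stock Black-Scholes market without factors, and the Sharpe ratio used as the performance yardstick. It is not the paper's RL algorithm, which this summary does not describe. The function names, the fixed-weight rebalancing rule, the zero risk-free rate, and all parameter values are assumptions chosen purely for illustration.

```python
import numpy as np


def simulate_black_scholes(mu, sigma, s0, dt, n_steps, seed=0):
    """Simulate correlated geometric Brownian motions (hypothetical helper).

    mu: drift vector (d,); sigma: volatility matrix (d, d); s0: initial prices (d,).
    Returns an array of prices with shape (n_steps + 1, d).
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    d = mu.shape[0]
    # Ito correction: the log-price drift is mu_i - 0.5 * sum_j sigma_ij^2.
    drift = (mu - 0.5 * np.sum(sigma**2, axis=1)) * dt
    z = rng.standard_normal((n_steps, d))
    log_increments = drift + np.sqrt(dt) * z @ sigma.T
    log_paths = np.vstack([np.zeros(d), np.cumsum(log_increments, axis=0)])
    return np.asarray(s0, dtype=float) * np.exp(log_paths)


def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of per-period simple returns."""
    excess = np.asarray(returns, dtype=float) - risk_free_per_period
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)


# Illustrative parameters (assumptions, not taken from the paper).
mu = [0.08, 0.05]
sigma = [[0.20, 0.00],
         [0.05, 0.15]]
prices = simulate_black_scholes(mu, sigma, s0=[100.0, 50.0],
                                dt=1 / 252, n_steps=252)

# Fixed weights rebalanced every period; the remainder earns a zero
# risk-free rate (an assumption made to keep the sketch short).
weights = np.array([0.5, 0.3])
stock_returns = prices[1:] / prices[:-1] - 1.0
portfolio_returns = stock_returns @ weights
print("annualized Sharpe ratio:", sharpe_ratio(portfolio_returns))
```

A learning algorithm in this setting would adjust the portfolio weights over time from observed prices alone, without being given `mu` or `sigma`; the sketch above fixes the weights only to show how a strategy's returns map to a Sharpe ratio.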
