QuantFactor REINFORCE: Mining Steady Formulaic Alpha Factors with Variance-bounded REINFORCE

Junjie Zhao; Chengxi Zhang; Min Qin; Peng Yang

arXiv:2409.05144 · q-fin.CP · June 18, 2025

TL;DR

This paper introduces a variance-bounded REINFORCE algorithm for mining interpretable, steady alpha factors in financial markets, addressing high variance issues of previous methods and improving correlation with asset returns.

Contribution

It proposes a novel REINFORCE-based reinforcement learning framework with a new baseline and reward shaping for more stable and effective alpha factor mining.
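To make the role of a baseline concrete, here is a minimal sketch of REINFORCE gradient estimation on a toy 3-action softmax policy, with and without a baseline. It is not the paper's baseline or its reward shaping: the toy rewards, the expected-reward baseline, and the function names `softmax` and `reinforce_gradients` are assumptions chosen only to illustrate why subtracting a baseline can lower variance without changing the expected gradient.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_gradients(logits, rewards, n_samples, baseline, rng):
    """Per-sample REINFORCE gradient estimates with respect to the logits.

    Each estimate is (R - baseline) * grad log pi(a). For a softmax policy,
    grad log pi(a) = one_hot(a) - pi.
    """
    probs = softmax(logits)
    actions = rng.choice(len(logits), size=n_samples, p=probs)
    one_hot = np.eye(len(logits))[actions]
    score = one_hot - probs                  # shape (n_samples, n_actions)
    advantage = rewards[actions] - baseline  # shape (n_samples,)
    return advantage[:, None] * score

rng = np.random.default_rng(0)
logits = np.zeros(3)                         # uniform policy (assumption)
rewards = np.array([10.0, 11.0, 12.0])       # toy per-action rewards (assumption)

# Illustrative baseline: the expected reward under the current policy.
expected_reward = float(softmax(logits) @ rewards)

g_plain = reinforce_gradients(logits, rewards, 20000, 0.0, rng)
g_base = reinforce_gradients(logits, rewards, 20000, expected_reward, rng)

# Total variance of the estimator, summed over gradient components.
var_plain = g_plain.var(axis=0).sum()
var_base = g_base.var(axis=0).sum()
mean_plain = g_plain.mean(axis=0)
mean_base = g_base.mean(axis=0)

print("variance without baseline:", var_plain)
print("variance with baseline:   ", var_base)
```

In this toy setting both estimators average to the same gradient, while the baselined one has much smaller variance because the large common offset in the rewards no longer multiplies the score function.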

Findings

01. Boosts correlation with returns by 3.83%

02. Achieves stronger excess returns than recent methods

03. Effectively reduces variance in policy gradient estimates
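For readers unfamiliar with how "correlation with returns" is typically measured for a formulaic factor, the sketch below computes a simple factor from prices and its average cross-sectional Pearson correlation with next-period returns (often called the information coefficient). The factor formula, the synthetic price data, and the names `ts_mean`, `example_alpha`, and `information_coefficient` are all hypothetical; the source does not state which exact metric or factors the paper uses.

```python
import numpy as np

def ts_mean(x, window):
    """Rolling mean along the time axis (axis 0); the first window-1 rows are NaN."""
    out = np.full_like(x, np.nan, dtype=float)
    for t in range(window - 1, x.shape[0]):
        out[t] = x[t - window + 1 : t + 1].mean(axis=0)
    return out

def example_alpha(close, window=5):
    """Hypothetical formulaic factor: close / ts_mean(close, window) - 1."""
    return close / ts_mean(close, window) - 1.0

def information_coefficient(factor, forward_returns):
    """Average over days of the cross-sectional Pearson correlation
    between factor values and next-period returns."""
    ics = []
    for f, r in zip(factor, forward_returns):
        mask = ~(np.isnan(f) | np.isnan(r))
        # Skip days with too few valid assets or no cross-sectional variation.
        if mask.sum() > 2 and f[mask].std() > 0 and r[mask].std() > 0:
            ics.append(np.corrcoef(f[mask], r[mask])[0, 1])
    return float(np.mean(ics))

# Synthetic data (assumption): 60 days x 20 assets of random-walk prices.
rng = np.random.default_rng(0)
log_steps = rng.normal(0.0, 0.01, size=(60, 20))
close = 100.0 * np.exp(np.cumsum(log_steps, axis=0))

# Next-period return for each day; the last day has no future price.
forward_returns = np.full_like(close, np.nan)
forward_returns[:-1] = close[1:] / close[:-1] - 1.0

factor = example_alpha(close, window=5)
ic = information_coefficient(factor, forward_returns)
print("mean IC of the example factor:", ic)
```

On random-walk data like this the correlation should be close to zero; a factor miner would search over formulas to find ones whose correlation with future returns is consistently higher.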

Abstract

Alpha factor mining aims to discover investment signals from historical financial market data, which can be used to predict asset returns and gain excess profits. Powerful deep learning methods for alpha factor mining lack interpretability, making them unacceptable in risk-sensitive real markets. Formulaic alpha factors are preferred for their interpretability, but their search space is complex and calls for powerful exploration methods. Recently, a promising framework was proposed for generating formulaic alpha factors using deep reinforcement learning, and it quickly gained research attention from both academia and industry. This paper first argues that the originally employed policy training method, i.e., Proximal Policy Optimization (PPO), faces several important issues in the context of alpha factor mining. Herein, a novel reinforcement learning algorithm based on the…

Taxonomy

Methods: Entropy Regularization · Proximal Policy Optimization · REINFORCE