QuantFactor REINFORCE: Mining Steady Formulaic Alpha Factors with Variance-bounded REINFORCE
Junjie Zhao, Chengxi Zhang, Min Qin, Peng Yang

TL;DR
This paper introduces a variance-bounded REINFORCE algorithm for mining interpretable, steady alpha factors in financial markets. It addresses the high variance of previous methods and improves correlation with asset returns.
Contribution
It proposes a novel REINFORCE-based reinforcement learning framework with a new baseline and reward shaping for more stable and effective alpha factor mining.
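To illustrate why a baseline matters for REINFORCE, here is a minimal sketch on a toy one-step softmax policy. It is not the paper's algorithm: this summary does not specify the paper's baseline or reward shaping, so a simple batch-mean baseline stands in for them (it adds a small finite-sample bias). The function name `reinforce_grads` and the toy reward values are hypothetical.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def reinforce_grads(logits, rewards, n_samples, use_baseline, rng):
    """Per-sample REINFORCE gradient estimates for a one-step softmax policy."""
    probs = softmax(logits)
    actions = rng.choice(len(logits), size=n_samples, p=probs)
    r = rewards[actions]
    # Assumption: batch-mean baseline, a stand-in for the paper's baseline.
    baseline = r.mean() if use_baseline else 0.0
    # For a softmax policy, d log pi(a) / d logits = onehot(a) - probs.
    grad_logp = np.eye(len(logits))[actions] - probs
    return (r - baseline)[:, None] * grad_logp

rng = np.random.default_rng(0)
logits = np.zeros(4)
rewards = np.array([10.0, 10.5, 11.0, 11.5])  # toy rewards with a large common offset

plain = reinforce_grads(logits, rewards, 5000, False, rng)
based = reinforce_grads(logits, rewards, 5000, True, rng)

# Total variance of the per-sample gradient estimates, summed over components.
var_plain = plain.var(axis=0).sum()
var_based = based.var(axis=0).sum()
print(f"no baseline: {var_plain:.3f}  with baseline: {var_based:.3f}")
```

Subtracting the baseline removes the large common offset in the rewards, so the per-sample estimates scatter far less around the same expected gradient. That is the general mechanism a variance-reducing baseline relies on.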
Findings
Boosts correlation with returns by 3.83%
Achieves stronger excess returns than recent methods
Effectively reduces variance in policy gradient estimates
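"Correlation with returns" for an alpha factor is commonly measured as the average cross-sectional Pearson correlation between factor values and subsequent asset returns, often called the information coefficient. The sketch below shows one way to compute it. The array layout `(days, assets)` and the function name `information_coefficient` are assumptions for illustration, and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def information_coefficient(factor, fwd_returns):
    """Mean over days of the cross-sectional Pearson correlation.

    Both inputs are arrays of shape (days, assets). Assumes each day's
    cross-section is non-constant; otherwise the denominator is zero.
    """
    f = factor - factor.mean(axis=1, keepdims=True)
    r = fwd_returns - fwd_returns.mean(axis=1, keepdims=True)
    denom = np.sqrt((f ** 2).sum(axis=1) * (r ** 2).sum(axis=1))
    daily_ic = (f * r).sum(axis=1) / denom
    return daily_ic.mean()

rng = np.random.default_rng(0)
returns = rng.normal(size=(20, 50))       # synthetic forward returns
noisy_factor = returns + rng.normal(size=(20, 50))  # factor = signal + noise

ic_perfect = information_coefficient(returns, returns)
ic_inverse = information_coefficient(-returns, returns)
ic_noisy = information_coefficient(noisy_factor, returns)
```

A factor identical to the returns scores 1, its negation scores -1, and a noisy factor lands in between.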
Abstract
Alpha factor mining aims to discover investment signals from historical financial market data that can be used to predict asset returns and earn excess profits. Powerful deep learning methods for alpha factor mining lack interpretability, which makes them unacceptable in risk-sensitive real markets. Formulaic alpha factors are preferred for their interpretability, but their search space is complex and calls for powerful exploratory methods. Recently, a promising framework was proposed for generating formulaic alpha factors using deep reinforcement learning, and it quickly attracted research attention from both academia and industry. This paper first argues that the originally employed policy training method, i.e., Proximal Policy Optimization (PPO), faces several important issues in the context of alpha factor mining. Herein, a novel reinforcement learning algorithm based on the…
Taxonomy
Methods: Entropy Regularization · Proximal Policy Optimization · REINFORCE
