Learning from Expert Factors: Trajectory-level Reward Shaping for Formulaic Alpha Mining

Junjie Zhao; Chengxi Zhang; Chenkai Wang; Peng Yang

arXiv:2507.20263·cs.LG·July 29, 2025

TL;DR

This paper introduces Trajectory-level Reward Shaping (TLRS), a reward shaping method for reinforcement-learning-based mining of formulaic alpha factors. TLRS supplies dense intermediate rewards and lowers computational cost, and the factors it mines show better predictive power in experiments on stock indices.

Contribution

The paper proposes TLRS, a new reward shaping technique that offers dense, subsequence-level rewards and reduces training variance, significantly improving RL-based alpha factor mining.

Findings

01. TLRS boosts the Rank IC by 9.29% over existing methods.

02. It reduces computational complexity from linear to constant.

03. Experiments on six stock indices validate its effectiveness.
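
For context on the metric in the first finding: Rank IC is commonly computed as the Spearman rank correlation between a factor's values across stocks on a given day and the returns those stocks realize afterwards. The sketch below illustrates that standard definition in plain Python. The function names are illustrative assumptions, and this is not the paper's evaluation code.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def rank_ic(factor_values, next_returns):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(factor_values), average_ranks(next_returns))
```

In practice the value is usually computed per trading day over the stock cross-section and then averaged across days; a factor that orders stocks exactly as their later returns do scores 1, and the reverse ordering scores -1.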

Abstract

Reinforcement learning (RL) has successfully automated the complex process of mining formulaic alpha factors for creating interpretable and profitable investment strategies. However, existing methods are hampered by the sparse rewards of the underlying Markov Decision Process. This sparsity limits the exploration of the vast symbolic search space and destabilizes the training process. To address this, Trajectory-level Reward Shaping (TLRS), a novel reward shaping method, is proposed. TLRS provides dense, intermediate rewards by measuring the subsequence-level similarity between partially generated expressions and a set of expert-designed formulas. Furthermore, a reward centering mechanism is introduced to reduce training variance. Extensive experiments on six major Chinese and U.S. stock indices show that TLRS significantly improves the predictive power of mined factors,…
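
The idea of subsequence-level shaping and reward centering described in the abstract can be sketched as follows. This is an illustrative approximation only, not the paper's exact formulation. It assumes formulas are token lists, scores a partial expression by the fraction of expert formulas that contain it as a contiguous token run, gives the empty expression a score of zero, and centers rewards with a simple running average. All names (`match_ratio`, `shaping_rewards`, `RewardCenterer`) and the toy expert set are hypothetical.

```python
def contains(formula, sub):
    """True if `sub` occurs as a contiguous run of tokens inside `formula`."""
    n = len(sub)
    return any(formula[i:i + n] == sub for i in range(len(formula) - n + 1))


def match_ratio(partial, experts):
    """Fraction of expert formulas containing `partial` (assumed similarity)."""
    if not partial:
        return 0.0  # assumption: the empty expression has zero potential
    return sum(contains(f, partial) for f in experts) / len(experts)


def shaping_rewards(tokens, experts):
    """One dense reward per generated token: the change in match ratio."""
    rewards = []
    previous = match_ratio([], experts)
    for t in range(1, len(tokens) + 1):
        current = match_ratio(tokens[:t], experts)
        rewards.append(current - previous)
        previous = current
    return rewards


class RewardCenterer:
    """Subtracts a running estimate of the mean reward (assumed variant)."""

    def __init__(self, step_size=0.5):
        self.step_size = step_size
        self.mean = 0.0

    def center(self, reward):
        centered = reward - self.mean
        # Move the running mean toward the latest observed reward.
        self.mean += self.step_size * (reward - self.mean)
        return centered


# Toy expert set in reverse Polish notation (hypothetical formulas).
EXPERTS = [
    ["close", "open", "sub"],
    ["close", "open", "div"],
    ["high", "low", "sub"],
]
```

Calling `shaping_rewards(["close", "open", "sub"], EXPERTS)` gives the agent a signal after every token instead of only once the full expression has been evaluated. Because each reward is a difference of consecutive scores, the rewards along a trajectory sum to the final score minus the initial one.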
