Dirichlet policies for reinforced factor portfolios
Eric André, Guillaume Coqueret

TL;DR
This paper introduces a reinforcement learning approach that uses Dirichlet policies to allocate factor portfolios. The learned allocations stay close to equal weighting, which the authors partly attribute to strong time variation in the relationship between returns and firm characteristics.
Contribution
It develops a novel RL framework with Dirichlet policies for factor investing and derives analytical policy gradients for portfolio optimization.
Findings
RL portfolios closely resemble equal weighting.
The agent learns to be factor-agnostic, which is partly explained by time variation in the relationship between returns and firm characteristics.
Analytical properties of Dirichlet-based policies are established.
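
For orientation, the standard Dirichlet facts behind these findings can be written out as follows. This is a generic sketch, not the paper's own notation: the symbols $w$ (portfolio weights), $\alpha$ (concentration parameters), $\alpha_0$ (their sum) and $\psi$ (the digamma function) are assumptions introduced here, and the paper's mapping from firm characteristics to $\alpha$ is not reproduced.

```latex
% Dirichlet density over weights w on the simplex (w_i > 0, \sum_i w_i = 1)
p(w \mid \alpha) = \frac{\Gamma(\alpha_0)}{\prod_{i=1}^{N} \Gamma(\alpha_i)}
                   \prod_{i=1}^{N} w_i^{\alpha_i - 1},
\qquad \alpha_0 = \sum_{i=1}^{N} \alpha_i .

% Mean allocation: equal concentrations give the 1/N portfolio on average
\mathbb{E}[w_i] = \frac{\alpha_i}{\alpha_0},
\qquad \alpha_1 = \dots = \alpha_N \;\Rightarrow\; \mathbb{E}[w_i] = \frac{1}{N}.

% Score function, the building block of a REINFORCE-type policy gradient
\frac{\partial \log p(w \mid \alpha)}{\partial \alpha_i}
  = \psi(\alpha_0) - \psi(\alpha_i) + \log w_i .
```

Because the score is available in closed form, a policy gradient can be computed without numerical differentiation of the density.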
Abstract
This article aims to combine factor investing and reinforcement learning (RL). The agent learns through sequential random allocations which rely on firms' characteristics. Using Dirichlet distributions as the driving policy, we derive closed forms for the policy gradients and analytical properties of the performance measure. This enables the implementation of REINFORCE methods, which we perform on a large dataset of US equities. Across a large range of parametric choices, our results indicate that RL-based portfolios are very close to the equally-weighted (1/N) allocation. This implies that the agent learns to be *agnostic* with regard to factors, which can partly be explained by cross-sectional regressions showing a strong time variation in the relationship between returns and firm characteristics.
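
The mechanism described in the abstract can be illustrated with a minimal sketch of one REINFORCE update under a Dirichlet policy. Everything specific here is an assumption made for illustration rather than the authors' implementation: the exponential link `alpha_i = exp(theta . x_i)` from characteristics to concentrations, the one-period portfolio return used as reward, and all function names (`digamma`, `concentrations`, `sample_weights`, `log_density`, `score`, `reinforce_step`). Only the Python standard library is used, so the digamma function is approximated by hand.

```python
import math
import random


def digamma(x):
    """Digamma function for x > 0: upward recurrence, then asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 * inv - series


def concentrations(theta, chars):
    """Assumed link: alpha_i = exp(theta . x_i), which keeps every alpha_i positive."""
    return [math.exp(sum(t * c for t, c in zip(theta, x))) for x in chars]


def sample_weights(alphas, rng):
    """Draw portfolio weights from Dirichlet(alphas) via normalised Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def log_density(theta, chars, weights):
    """Log of the Dirichlet density of `weights` under the policy parameters."""
    alphas = concentrations(theta, chars)
    return (math.lgamma(sum(alphas))
            - sum(math.lgamma(a) for a in alphas)
            + sum((a - 1.0) * math.log(w) for a, w in zip(alphas, weights)))


def score(theta, chars, weights):
    """Gradient of log_density with respect to theta (chain rule through alpha)."""
    alphas = concentrations(theta, chars)
    psi_total = digamma(sum(alphas))
    grad = [0.0] * len(theta)
    for a, x, w in zip(alphas, chars, weights):
        d_alpha = psi_total - digamma(a) + math.log(w)  # d log p / d alpha_i
        for k, c in enumerate(x):
            grad[k] += d_alpha * a * c  # d alpha_i / d theta_k = alpha_i * x_ik
    return grad


def reinforce_step(theta, chars, returns, rng, lr=0.01, baseline=0.0):
    """One REINFORCE update: sample weights, observe reward, follow the score."""
    weights = sample_weights(concentrations(theta, chars), rng)
    reward = sum(w * r for w, r in zip(weights, returns))  # assumed reward
    grad = score(theta, chars, weights)
    new_theta = [t + lr * (reward - baseline) * g for t, g in zip(theta, grad)]
    return new_theta, weights, reward
```

In this sketch `chars` is a list of per-asset characteristic vectors and `returns` the realised asset returns for the period; calling `reinforce_step` repeatedly over successive periods, with a running average of past rewards passed as `baseline`, would be one way to reduce the variance of the updates.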
Taxonomy
Methods: REINFORCE
