Portfolio Reinforcement Learning with Scenario-Context Rollout

Vanya Priscillia Bendatu; Yao Lu

arXiv:2602.24037 · cs.AI · March 2, 2026

TL;DR

This paper introduces a scenario-context rollout method for portfolio reinforcement learning that generates stress-test scenarios to improve policy stability and performance during market regime shifts, significantly enhancing Sharpe ratios and reducing drawdowns.

Contribution

The paper proposes a counterfactual rollout approach that stabilizes RL critic training and incorporates stress scenarios into portfolio management.

Findings

01 Up to 76% improvement in Sharpe ratio

02 Up to 53% reduction in maximum drawdown

03 Effective across 31 diverse market universes
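
For readers unfamiliar with the two metrics named in the findings, the sketch below computes them from a series of per-period returns using their standard textbook definitions. The function names, the 252-day annualization factor, and the zero default risk-free rate are assumptions made for illustration; the listing does not state which conventions the paper uses.

```python
import numpy as np


def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualized Sharpe ratio of per-period simple returns.

    Assumption: `risk_free` is an annual rate spread evenly over periods.
    """
    excess = np.asarray(returns, dtype=float) - risk_free / periods_per_year
    std = excess.std(ddof=1)  # sample standard deviation
    if std == 0.0:
        return 0.0  # undefined ratio; return 0.0 by convention in this sketch
    return float(np.sqrt(periods_per_year) * excess.mean() / std)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the wealth curve, as a positive fraction."""
    wealth = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    # Prepend the starting wealth of 1.0 so a loss in the first period counts.
    wealth = np.concatenate(([1.0], wealth))
    running_peak = np.maximum.accumulate(wealth)
    return float(np.max(1.0 - wealth / running_peak))
```

With these definitions, a higher Sharpe ratio and a lower maximum drawdown both indicate a better risk-adjusted outcome, which is the direction of the improvements reported above.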

Abstract

Market regime shifts induce distribution shifts that can degrade the performance of portfolio rebalancing policies. We propose macro-conditioned scenario-context rollout (SCR), which generates plausible next-day multivariate return scenarios under stress events. Doing so raises a new challenge: history never reveals what would have happened differently. As a result, incorporating scenario-based rewards from rollouts introduces a reward-transition mismatch in temporal-difference learning, destabilizing RL critic training. We analyze this inconsistency and show it leads to a mixed evaluation target. Guided by this analysis, we construct a counterfactual next state using the rollout-implied continuations and augment the critic agent's bootstrap target. Doing so stabilizes the learning and provides a viable bias-variance tradeoff. In out-of-sample evaluations across 31…
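
The reward-transition mismatch described in the abstract can be illustrated with a one-step temporal-difference target. The sketch below is not the paper's formulation, which this listing does not show. It assumes a hypothetical mixing weight `mix` that blends the critic's value at the historical next state with its value at a counterfactual next state, and the function and argument names are invented for illustration.

```python
import numpy as np


def td_targets(r_scenario, v_next_hist, v_next_cf, gamma=0.99, mix=0.5):
    """Return (mismatched, augmented) one-step bootstrap targets.

    Hypothetical sketch, not the authors' method:
      r_scenario  : rewards computed from rollout scenarios
      v_next_hist : critic values at the historically observed next states
      v_next_cf   : critic values at counterfactual next states
      mix         : assumed blending weight in [0, 1]
    """
    if not 0.0 <= mix <= 1.0:
        raise ValueError("mix must lie in [0, 1]")
    r = np.asarray(r_scenario, dtype=float)
    v_hist = np.asarray(v_next_hist, dtype=float)
    v_cf = np.asarray(v_next_cf, dtype=float)

    # Mismatch: the reward comes from a scenario, but the bootstrap value
    # comes from the state that history actually produced.
    mismatched = r + gamma * v_hist

    # Augmented target: part of the bootstrap uses the counterfactual
    # continuation that is consistent with the scenario reward.
    augmented = r + gamma * ((1.0 - mix) * v_hist + mix * v_cf)
    return mismatched, augmented
```

Under this assumed form, `mix=0` recovers the mismatched target and `mix=1` bootstraps entirely from the counterfactual state, so intermediate values trade the bias of the mismatch against the variance of generated continuations.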

Taxonomy

Topics: Stock Market Forecasting Methods · Advanced Bandit Algorithms Research · Financial Markets and Investment Strategies