Deep Hedging with Reinforcement Learning: A Practical Framework for Option Risk Management
Travon Lucius, Christian Koch Jr, Jacob Starling, Julia Zhu, Miguel Urena, Carrie Hu

TL;DR
This paper introduces a reinforcement learning framework for dynamic option hedging under transaction costs and position limits, reporting improved risk-adjusted performance over no-hedge, momentum, and volatility-targeting baselines.
Contribution
It develops a practical RL-based hedging approach built on a leak-free environment, a cost-aware reward function, and a lightweight stochastic actor-critic agent, extending deep hedging to real-world trading constraints such as transaction costs and position limits.
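To make the idea of a cost-aware, leak-free step concrete, here is a minimal sketch in Python. It is not the authors' environment: the function name `hedged_step`, the proportional cost model in basis points, and the parameters `cost_bps`, `max_pos`, and `exposure` are all assumptions chosen for illustration. The hedge is chosen before the next-day return is known, clipped to a position limit, and charged a cost proportional to the change in position.

```python
import numpy as np

def hedged_step(position, action, next_return,
                cost_bps=1.0, max_pos=1.0, exposure=1.0):
    """One daily step of a hypothetical hedging environment.

    position    : hedge position held coming into the step (units of underlying)
    action      : desired new hedge position, chosen from information up to today
    next_return : underlying return realized AFTER the action is taken
    cost_bps    : assumed proportional transaction cost, in basis points
    max_pos     : assumed symmetric position limit on the hedge
    exposure    : normalized exposure being hedged (1.0 = one unit of delta)
    """
    # Enforce the position limit on the requested hedge.
    new_position = float(np.clip(action, -max_pos, max_pos))

    # Proportional cost on the amount traded this step.
    turnover = abs(new_position - position)
    cost = cost_bps * 1e-4 * turnover

    # Net exposure earns the next-period return; the reward is P&L net of cost.
    pnl = (exposure + new_position) * next_return
    reward = pnl - cost
    return new_position, reward

# Example: a request of -2.0 is clipped to -1.0, fully offsetting the exposure.
pos, r = hedged_step(position=0.0, action=-2.0, next_return=0.01, cost_bps=10.0)
```

Because `next_return` enters only after the action is fixed, the step cannot use future information when choosing the hedge, which is the sense of "leak-free" assumed in this sketch.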
Findings
RL hedging achieves a higher Sharpe ratio than the no-hedge, momentum, and volatility-targeting baselines.
The learned policy maintains controlled turnover and is robust to increased transaction costs.
The framework is extensible to multi-asset settings and alternative risk objectives.
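The two quantities these findings rest on, Sharpe ratio and turnover, can be computed as in the following sketch. The function names, the 252-day annualization, and the zero risk-free rate are assumptions for illustration; the paper's exact conventions are not stated in this summary.

```python
import numpy as np

def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio of daily returns, assuming a zero risk-free rate."""
    r = np.asarray(daily_returns, dtype=float)
    sd = r.std(ddof=1)  # sample standard deviation
    if sd == 0.0:
        return float("nan")  # undefined when returns do not vary
    return float(np.sqrt(periods_per_year) * r.mean() / sd)

def average_turnover(positions):
    """Mean absolute day-over-day change in the hedge position."""
    p = np.asarray(positions, dtype=float)
    return float(np.abs(np.diff(p)).mean())

# Example with made-up numbers (not results from the paper).
sharpe = annualized_sharpe([0.01, -0.01, 0.02, 0.0])
turnover = average_turnover([0.0, -1.0, -0.5])
```

Comparing a policy against a baseline then amounts to evaluating both functions on each strategy's net-of-cost return series and position path over the same test period.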
Abstract
We present a reinforcement-learning (RL) framework for dynamic hedging of equity index option exposures under realistic transaction costs and position limits. We hedge a normalized option-implied equity exposure (one unit of underlying delta, offset via SPY) by trading the underlying index ETF, using the option surface and macro variables only as state information and not as a direct pricing engine. Building on the "deep hedging" paradigm of Buehler et al. (2019), we design a leak-free environment, a cost-aware reward function, and a lightweight stochastic actor-critic agent trained on daily end-of-day panel data constructed from SPX/SPY implied volatility term structure, skew, realized volatility, and macro rate context. On a fixed train/validation/test split, the learned policy improves risk-adjusted performance versus no-hedge, momentum, and volatility-targeting baselines (higher…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Financial Markets and Investment Strategies · Stochastic processes and financial applications
